WORLD INTELLECTUAL PROPERTY ORGANIZATION
International Bureau 




PCT 

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification 5:

C12N 15/00, 15/12, 1/21, C07K 13/00,
15/28, C12Q 1/68, A01K 67/027

A1

(11) International Publication Number: WO 94/13794

(43) International Publication Date: 23 June 1994 (23.06.94)



(21) International Application Number: PCT/US93/11725

(22) International Filing Date: 3 December 1993 (03.12.93)

(30) Priority Data:

07/985,296    4 December 1992 (04.12.92)    US



(71) Applicant: BRIGHAM AND WOMEN'S HOSPITAL, INC.
[US/US]; 75 Francis Street, Boston, MA 02115 (US).

(72) Inventor: LAWLER, John, W.; 657 Humphrey Street,
Swampscott, MA 01907 (US).

(74) Agent: GATES, Edward, R.; Wolf, Greenfield & Sacks, 600
Atlantic Avenue, Boston, MA 02210 (US).

(81) Designated States: AU, CA, JP, European patent (AT, BE,
CH, DE, DK, ES, FR, GB, GR, IE, IT, LU, MC, NL, PT,
SE).



Published 

With international search report. 

Before the expiration of the time limit for amending the 
claims and to be republished in the event of the receipt of 
amendments. 



(54) Title: HUMAN THROMBOSPONDIN-4 




[Figure: schematic of thrombospondin domain structure. Labels:
amino-terminal, procollagen homology, type 1 repeats, type 2
repeats, type 3 repeats, carboxy-terminal.]



(57) Abstract 

A novel member of the thrombospondin gene family, thrombospondin-4, has been cloned and sequenced. A frog thrombospondin-4 
DNA and the mammalian homolog of the frog DNA are disclosed. Recombinant vectors and cells are described. Methods of providing 
isolated thrombospondin-4 DNA and polypeptide sequences are disclosed, as well as methods of making transgenic animals containing, or 
lacking, the thrombospondin-4 gene. 






HUMAN THROMBOSPONDIN-4

This invention was made with U.S. Government support under
National Institutes of Health Grant No. NIH: HL28749. The 
U.S. Government has certain rights to this invention. 

BACKGROUND OF THE INVENTION 
Platelet thrombospondin is a glycoprotein that is
structurally and functionally similar to the adhesive
glycoproteins found in a wide variety of cells. The
thrombospondin genes encode two distinct polypeptides,
designated thrombospondin-1 and -2 (Bornstein et al., J.
Biol. Chem., 266:12821-12824, 1991; and 265:16691-16698,
1991; Proc. Nat. Acad. Sci. USA 88:8636-8640 (1990); Wolf et
al., Genomics, 6:685-691 (1990)). Thrombospondin-3 is a
recently discovered member of the thrombospondin gene family
(Vos et al., J. Biol. Chem., 267:12192-12196 (1992)).

Partial or complete cDNA sequences are available for
human, mouse and frog thrombospondin-1, and human, mouse and
chicken thrombospondin-2 (Lawler and Hynes, J. Cell Biol.,
103:1635-1648 (1986); Bornstein et al., supra; Lawler et
al., J. Biol. Chem., 266:8039-8043 (1991); Genomics, 11:
587-600 (1991)). The overall molecular architecture of
thrombospondin-1 and -2 is substantially the same. The
predicted amino acid sequences of thrombospondins-1 and -2
are very similar in their repeat sequences and their
COOH-terminal domains.

The central portion of platelet thrombospondin is
composed of multiple copies of structural motifs found in
other proteins (Lawler and Hynes, supra (1986)). Amino acid
sequences that have been shown to mediate cellular attachment
are also present in the central portion of the molecule (Rich
et al., Science, 249:1574-1577 (1990)). In addition,
thrombospondin contains a region that is rich in calcium
binding sites and that contains the RGD sequence that
promotes adhesion of some cell types (Lawler et al. (1988)).

Thrombospondin has been shown to modulate its attachment
to a variety of cell types in vitro. The NH2-terminal
heparin-binding domain binds to proteoglycans, including
syndecan, and to cell surface sulfatides (Sun et al., J.
Biol. Chem., 264:2885-2889 (1989)). Thrombospondin also
interacts with CD36 or platelet glycoprotein IV (Stromski et
al., Exp. Cell Res., 198:85-92 (1992)). Several integrin
receptors have been reported to bind thrombospondin (Lawler
et al., supra (1988)). These integrin receptors are reported
to be involved in neurite outgrowth (Neugebauer et al.,
Neuron, 6:345-358 (1991)). Through these and yet to be
identified interactions, thrombospondin can modulate cell
adhesion, cell migration, angiogenesis and neurite outgrowth.

The human platelet thrombospondins 1 and 2 that have
already been characterized in the prior art are schematically
illustrated in FIG. 1. The term "thrombospondin" refers to
adhesive glycoproteins of about 420,000-dalton molecular
weight that are involved in modulation of cell growth and
migration. Thrombospondins are composed of three
polypeptides linked by disulfide bonds. The N-terminal end
binds heparin, and the C-terminal end assists in platelet
aggregation.

Three types of internal repeating structures are found in 
human thrombospondin-1 and thrombospondin-2 polypeptides. 
These are the type 1, 2 and 3 domains ("repeats"). In 
addition to the three types of domains, thrombospondins 1 and 
2 also contain a region of homology to procollagen, as well 
as amino and carboxyl-termini.

Human thrombospondins-1 and -2 have three, type 1
domains. Type 1 domains are homologous to several of the
complement factors, including C-8, C-9 and properdin. Type 1
domains are also found in two proteins produced in
malaria-parasitized blood cells. These are circumsporozoite
protein and the thrombospondin related anonymous protein
(Robson et al., Nature 335:79-82 (1988)). Three copies of
type 1 domains are also found in the UNC-5 gene of C. elegans
(Culotti et al., J. Cell Biol. 115:1229 (1991)). The type
1 domains of thrombospondin-1 and -2 extend from nucleotide
1210 to 1719 (Lawler and Hynes, J. Cell Biol. 103:
1635 (1986)).

Human thrombospondins-1 and 2 have three, type 2 
domains. Type 2 domains are similar to epidermal growth 
factor (EGF) in that they are framed around a characteristic 
spacing of six cysteines. Multiple copies of the EGF repeat are
commonly found in adhesive glycoproteins and cell adhesion
molecules. The type 2 domains extend from nucleotide
1720 to 2151 on thrombospondins-1 and -2.

SUMMARY OF THE INVENTION 
According to one aspect of the invention, an isolated 
nucleotide sequence encoding a new member of the 
thrombospondin family, thrombospondin-4, or unique fragments
of thrombospondin-4, is provided. One embodiment is an 
isolated DNA sequence encoding a thrombospondin, that has at 
least four, type 2 domains. In another embodiment, the 
sequence encodes a thrombospondin that lacks any type 1 
domains. A further embodiment is a sequence encoding a 
thrombospondin that lacks a region of homology with 
procollagen. Yet another embodiment is a sequence that 
encodes a thrombospondin that has four, type 2 domains, lacks 
type 1 domains and lacks a region of homology to 
procollagen. The preferred DNA of the present invention is a 
human homolog of thrombospondin-4. Additionally, the 
invention relates to vertebrate thrombospondin-4 genes 
isolated from porcine, ovine, bovine, feline, avian, equine, 
or canine, as well as primate sources and any other species 
in which thrombospondin-4 structure exists. 

Also provided are recombinant cells and plasmids 
containing the foregoing isolated DNA, preferably linked to a 
promoter. Portions of the foregoing nucleotide sequences are
also included in the invention. One such portion is
contained in a vector within a host cell. 

According to another aspect of the invention, isolated 
thrombospondin protein is provided, having at least four type 
2 domains. Other thrombospondins lack any type 1 domains 
and/or lack any procollagen homology. Portions of the 
foregoing isolated thrombospondin proteins are also included 
in the invention. Antibodies with selective binding 
specificity for the thrombospondin protein of the invention 
also are provided. 

Another aspect of the invention is a method for producing 
thrombospondin polypeptide. The method includes providing an 
expression vector to a host, the vector containing a DNA 
sequence of the invention having at least four, type 2 
domains; allowing the host to express the thrombospondin, and 
isolating the expressed thrombospondin. 

A further aspect of the invention is a probe capable of 
distinguishing thrombospondin-4 from thrombospondins -1, -2, 
and -3. The probe can include a nucleotide sequence encoding 
a thrombospondin-4 polypeptide with at least four, type 2 
domains, that lacks any type 1 domains, and lacks a region of 
homology to procollagen. The nucleotide sequence also can 
encode a thrombospondin-4 polypeptide having sequences unique 
to the polypeptide. 

Also provided is a thrombospondin-4 polypeptide having a 
restricted range of expression in tissues. The preferred 
polypeptide is expressed in human heart and skeletal muscle, 
but is not expressed in human placenta, liver or kidney. 

The novel molecules of the invention can be employed in
experimental or therapeutic protocols. For example, a method
for interfering with the activity of a thrombospondin-4 gene
may be accomplished by providing a construct arranged to
include a thrombospondin nucleotide sequence which, when
inserted, inactivates transcription of messenger RNA for
thrombospondin-4 and/or translation of that messenger RNA
into thrombospondin-4 protein. This construct further has a
promoter operatively linked to the sequence. Next, the
construct is introduced into a cell, and the construct is
allowed to homologously recombine with complementary
sequences of the cell genome. Finally, cells lacking the
ability to transcribe thrombospondin-4 are selected.

These and other aspects of the invention, as well as its
various advantages and utilities, will be more apparent
with reference to the detailed description of the invention 
when taken in connection with the accompanying drawings. It 
is to be understood that the drawings are designed for the 
purpose of illustration only and are not intended as a 
definition of the limits of the invention. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1: Schematic drawing of human thrombospondin-1 and
thrombospondin-2.

FIG. 2: Schematic drawing of human thrombospondin-4. The
drawing schematically depicts an actual nucleotide sequence
of 3120 nucleotides, with a message of approximately 3.3 kb.

FIG. 3: Alignment of restriction fragments of Xenopus 
thrombospondin-4 clones. Restriction endonuclease sites are 
indicated for the two families (TSP-4A and TSP-4B). The
clones that have been isolated in the first (XF1-XF4), second
(XS5-XS10) and third (XT11-XT14) rounds of screening have 
been grouped into their appropriate family by restriction 
endonuclease mapping and nucleotide sequencing. 

FIG. 4: Photograph of a Northern blot of Xenopus stage 17
RNA probed with the XF3 clone of FIG. 3. Two micrograms of
total stage 17 mRNA was electrophoresed and blotted. 
Positions and sizes of markers are shown on the left. 

FIG. 5: The expression of thrombospondin-4 in adult human
tissue. A Northern blot of poly A+ RNA from adult human
heart (a), brain (b), placenta (c), lung (d), liver (e),
skeletal muscle (f), kidney (g) and pancreas (h). The blot
was probed with a 2.2 kb fragment of Xenopus
thrombospondin-4. The positions and sizes (kb) of the
markers are indicated on the left.

DETAILED DESCRIPTION OF THE INVENTION

Human thrombospondins 1 and 2 have seven, type 3
domains. Type 3 domains extend from nucleotide 2221 to
2926 on thrombospondins-1 and -2. Type 3 domains include a
large number of calcium-binding sites. The consensus
sequence of these type 3 domains is similar to calcium
binding site sequences of calmodulin, parvalbumin and
fibrinogen beta and gamma subunits (Lawler and Hynes,
supra). In particular, there are aspartic acid residues at
positions 6, 8, 10, 14 and 17 of the type 3 domains, as well
as a second set at positions 21, 23, 25, 29 and 32.
Moreover, glycine residues at positions 11 and 26 are also
homologous with calcium-binding sites of calmodulin and
parvalbumin (Lawler and Hynes, supra). To date, no other
protein has been identified that could potentially bind as
much calcium as thrombospondin. Furthermore, no other
protein has been identified in which the calcium binding
sites are contiguous. The thrombospondins of the invention,
like other thrombospondins characterized to date (i.e.,
thrombospondins-1 and -2), have an N-terminal region that is
more than 200 amino acids in length. In thrombospondins-3
and -4, which lack procollagen and type 1 domains, this
N-terminal region precedes the type 2 domains. In
thrombospondins-1 and -2, this N-terminal region precedes
both the procollagen and type 1 domains.

Thrombospondins-1 and -2 also have a region adjacent to the
N-terminal end that is substantially homologous to the known
sequence of procollagen. This region extends from
nucleotides 916 to 1209 on thrombospondins-1 and -2.
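
The nucleotide coordinates quoted in the text for thrombospondins-1 and -2 can be tabulated to see how the regions sit relative to one another. The following Python sketch is illustrative only: the names `REGIONS`, `region_length` and `gaps_between` are assumptions introduced here, and the coordinates are simply those stated in this description.

```python
# Inclusive nucleotide coordinates as stated in the text for
# thrombospondins-1 and -2. All names here are illustrative.
REGIONS = {
    "procollagen homology": (916, 1209),
    "type 1 repeats": (1210, 1719),
    "type 2 repeats": (1720, 2151),
    "type 3 repeats": (2221, 2926),
}


def region_length(start, end):
    """Length in nucleotides of an inclusive coordinate range."""
    return end - start + 1


def gaps_between(regions):
    """Nucleotides separating consecutive regions, in coordinate order."""
    ordered = sorted(regions.items(), key=lambda item: item[1][0])
    gaps = {}
    for (name_a, (_, end_a)), (name_b, (start_b, _)) in zip(ordered, ordered[1:]):
        gaps[(name_a, name_b)] = start_b - end_a - 1
    return gaps


if __name__ == "__main__":
    for name, (start, end) in REGIONS.items():
        print(f"{name}: {region_length(start, end)} nt")
    for (a, b), gap in gaps_between(REGIONS).items():
        print(f"{a} -> {b}: {gap} nt in between")
```

On these figures the procollagen, type 1 and type 2 regions are directly contiguous, while the quoted type 3 coordinates begin some distance after the end of the type 2 region.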

The novel member of the thrombospondin family,
hereinafter called "thrombospondin-4", has the schematic
structure depicted in FIG. 2.

In complete contrast to human thrombospondins 1 and 2, 
thrombospondin-4 lacks type 1 domains entirely. 
Thrombospondin-4 also lacks a region homologous to 
procollagen, in contrast to the known thrombospondins 1 and
2. The molecular architecture of much of the N-terminal end
of thrombospondin-4 is thus distinct from that of human 
thrombospondins 1 or 2. 

Moreover, thrombospondin-4 has four, type 2 domains 
(FIG. 2) in contrast to thrombospondins -1 and -2 which have 
three, type 2 domains (see FIG. 1). 

Thrombospondin-4 has the same number of calcium-binding 
sites located within the type 3 domains as do thrombospondins
1 and 2.

The configuration and number of repeats, as well as the 
lack of procollagen homology and lack of type 1 domains, 
define the unique thrombospondin-4 structure. 

One embodiment of a thrombospondin-4 molecule, according 
to the invention, is the isolated nucleotide sequence shown 
in SEQ ID NO.: 1. By "isolated" it is meant a nucleic acid 
sequence: (i) amplified in vitro by, for example, polymerase 
chain reaction (PCR); (ii) synthesized by, for example, 
chemical synthesis; (iii) recombinantly produced by cloning; 
or (iv) purified, as by cleavage and gel separation. The 
term "isolated" is also meant to include polypeptides encoded 
by isolated nucleic acid sequences, as well as polypeptides 
synthesized by, for example, chemical synthetic methods, and 
polypeptides separated from biological materials, and then 
purified using conventional protein analytical procedures.

SEQ ID NO.: 1 is a thrombospondin-4 that has been 
isolated from the frog, Xenopus laevis.

An open reading frame of 889 amino acids is predicted from 
the Xenopus nucleotide sequence. The deduced amino acid 
sequence encoded by the Xenopus thrombospondin-4 DNA sequence 
is given in SEQ ID NO.: 2. The first 216 amino acids of 
Xenopus thrombospondin-4 have little homology with human 
thrombospondins 1 and 2, primarily because of the lack of 
type 1 repeats and the lack of procollagen sequence in 
Xenopus thrombospondin-4.

Four adjacent type 2 domains can be identified in Xenopus
thrombospondin-4 on the basis of the positions of the
cysteine residues. The overall homology with other
thrombospondins is low in this type 2 region, and the
introduction of several gaps is necessary to optimize the
alignment. The second of the type 2 domains is, however,
similar to those of thrombospondins-1 and -2, in that
thirteen residues are inserted between the last two cysteine
residues. The amino acid sequences of the four type 2
domains of thrombospondin-4 are shown below in Table 1.

Table 1: TYPE 2 DOMAINS OF THROMBOSPONDIN-4

PRCDATS   CFRGVRCIDTEGGFQ-CGPCPEGYTG             NGVICTDV
DECRL--NP-CFLGVRCINTSPGFK-CESCPPGYTGSTIQGIGINFAKQNKQVCTDT
NECENGRNGGCTSNSLCINTMGSFR-CGGCKPGYVG            DQIKGCKPE
KSCRHGQNP-CHASAQCSEEKVGDVTCT-CSVGWAG             NGYLCGK
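
The cysteine framework that identifies the type 2 domains can be checked directly on the Table 1 sequences. The sketch below is a minimal illustration, assuming the sequences as transcribed from Table 1 with alignment gaps (spaces and dashes) stripped; the function names are hypothetical and not part of the original disclosure.

```python
import re

# Type 2 domain sequences transcribed from Table 1. Spaces and dashes are
# alignment gaps and are removed before any counting.
TYPE2_DOMAINS = [
    "PRCDATS CFRGVRCIDTEGGFQ-CGPCPEGYTG NGVICTDV",
    "DECRL--NP-CFLGVRCINTSPGFK-CESCPPGYTGSTIQGIGINFAKQNKQVCTDT",
    "NECENGRNGGCTSNSLCINTMGSFR-CGGCKPGYVG DQIKGCKPE",
    "KSCRHGQNP-CHASAQCSEEKVGDVTCT-CSVGWAG NGYLCGK",
]


def ungap(seq):
    """Remove every character that is not an uppercase residue letter."""
    return re.sub(r"[^A-Z]", "", seq)


def cysteine_count(seq):
    """Number of cysteine residues in the ungapped sequence."""
    return ungap(seq).count("C")


def last_spacer_length(seq):
    """Number of residues lying between the last two cysteines."""
    positions = [i for i, aa in enumerate(ungap(seq)) if aa == "C"]
    return positions[-1] - positions[-2] - 1


if __name__ == "__main__":
    for number, domain in enumerate(TYPE2_DOMAINS, start=1):
        print(number, cysteine_count(domain), last_spacer_length(domain))
```

The spacing between the last two cysteines is the quantity that sets the second domain apart from the first in the description above.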

The type 3 domains of Xenopus thrombospondin-4 are 61.4% 
identical to the type 3 domains of human thrombospondins 1 
and 2. The consensus sequence and overall organization of 
the seven, type 3 repeats of Xenopus thrombospondin-4 are 
equivalent to those of thrombospondins-1 and -2, with the 
second and fourth type 3 domains being truncated after the 
second cysteine. Thrombospondin-4, however, contains 4 amino 
acids (PPGP) at the end of the sixth, type 3 domain that do 
not align with sequences on thrombospondins -1 and -2. 
Further, thrombospondin-4 does not contain an RGD sequence.
The seven, type 3 domains of Xenopus thrombospondin-4 are
shown below in Table 2.

The consensus sequence for Xenopus is compared to that
for human and mouse thrombospondin-1 and chicken
thrombospondin-2 at the bottom of Table 2. The underline
indicates that an N occupies one of the positions that is
occupied by a D.

Table 2: TYPE 3 DOMAINS OF THROMBOSPONDIN-4

DNCVYVPNSGQEDTDKDNIGDACDE--DADGDGILNEQ
DNCVLAANIDQKNSDQDIFGDAC
DNCRLTLNNDQRDTDNDGKGDACDD--DMDGDGIKNIL
DNCQRVPNVDQKDKDGDGVGDIC
DSCPDIIMPNQSDIDNDLVGDSCDTNQDSDGDGHQDST
DNCPTVINSNQLDTDKDGIGDECDD--DDDNDGIPDTVPPGP
DNCKLVPNPGQEDDNNDGVGDVCEA--DFDQDTVIDRI

D.C....N..Q.D.D.D..GD.C....D.D.D          Consensus
DNC....N..Q.D.D.D..GD.C....D.D.D          TSP-1 and 2 Consensus
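
The comparison of each type 3 repeat with the consensus line can be sketched in a few lines of Python. This is an illustration under stated assumptions: the sequences are transcribed from Table 2, each long dash in the scanned table is read as a two-position alignment gap ("--"), a "." in the consensus is treated as "any residue", and the function name is hypothetical.

```python
# Xenopus type 3 consensus from Table 2; "." means no fixed residue.
CONSENSUS = "D.C....N..Q.D.D.D..GD.C....D.D.D"

# Type 3 repeats transcribed from Table 2 (assumed gap reading: "--").
TYPE3_REPEATS = [
    "DNCVYVPNSGQEDTDKDNIGDACDE--DADGDGILNEQ",
    "DNCVLAANIDQKNSDQDIFGDAC",
    "DNCRLTLNNDQRDTDNDGKGDACDD--DMDGDGIKNIL",
    "DNCQRVPNVDQKDKDGDGVGDIC",
    "DSCPDIIMPNQSDIDNDLVGDSCDTNQDSDGDGHQDST",
    "DNCPTVINSNQLDTDKDGIGDECDD--DDDNDGIPDTVPPGP",
    "DNCKLVPNPGQEDDNNDGVGDVCEA--DFDQDTVIDRI",
]


def consensus_mismatches(seq, consensus=CONSENSUS):
    """Return 1-based positions where a fixed consensus residue is not matched.

    Truncated repeats are compared only over the positions they cover.
    """
    mismatches = []
    for position, (wanted, found) in enumerate(zip(consensus, seq), start=1):
        if wanted != "." and wanted != found:
            mismatches.append(position)
    return mismatches


if __name__ == "__main__":
    for number, repeat in enumerate(TYPE3_REPEATS, start=1):
        print(number, consensus_mismatches(repeat))
```

A repeat that departs from the consensus at a single position is reported with that position, which is one way to locate substitutions such as an N standing where the consensus has a D.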

Alignment of the carboxyl terminus of the Xenopus
thrombospondin-4 sequence with the last 227 amino acids of
human thrombospondin-1 reveals that 60.8% of the amino acids
are identical and no insertions or deletions are required.
SEQ ID NO.: 2 extends 15 amino acids beyond the stop codon
for human thrombospondin-1.
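
Percent identity figures such as those quoted above are obtained by counting identical residues over aligned positions. The function below is a minimal sketch, not the procedure actually used for this application; it assumes two pre-aligned sequences of equal length and skips gapped columns, which is only one of several conventions for the denominator.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over ungapped aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # columns containing a gap are not counted
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy sequences for illustration only, not taken from the SEQ ID listings.
    print(percent_identity("DNCV", "DNCL"))
```

By the same arithmetic, 60.8% identity over 227 aligned positions corresponds to about 138 identical residues.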

A particularly preferred embodiment of a thrombospondin-4
molecule has the nucleotide sequence shown in SEQ ID NO.: 3.
This is a human homolog of the Xenopus sequence containing
about 45 more amino acids at the amino-terminal end than the
Xenopus sequence of SEQ ID NO.: 2. Approximately the first
10 nucleotides in SEQ ID NO.: 3 are linkers from the cloning
library and are not thrombospondin-4 sequence. An open
reading frame that is about 900 amino acids long (SEQ ID
NO.: 4) is predicted from the nucleotide sequence of this
human homolog.

It has not yet been proved that the methionine at the 5' end
of SEQ ID NOS.: 3 and 4 is the beginning of the coding region.
The methionine is close to the 5' end and the sequence that
follows represents a reasonable signal sequence.
Nevertheless, the molecular architecture of the human homolog
is substantially identical to that of Xenopus
thrombospondin-4. That is, the human homolog nucleotide
sequence has the same structure as Xenopus thrombospondin-4
(i.e., lacks type 1 repeats and procollagen homology, has
four, type 2 repeats, and has seven, type 3 repeats).

The pattern of expression of thrombospondin-4 in human 
tissues is markedly different from the pattern of expression 
of thrombospondins 1 and 2 in human tissues. Northern blots
of poly A+ selected RNA from adult human tissues were
performed and probed with Xenopus thrombospondin-4 and the
human homolog of Xenopus thrombospondin-4. Thrombospondin-4
showed high levels of expression in human heart and skeletal 
muscle (Example 3). No expression was detected in the 
placenta, liver or kidney. Thrombospondin-3 had its 
strongest Northern blot signal in the lung. The adult lung 
also produced the strongest signal when a blot was probed 
with thrombospondin-1 (Example 3). Thus, the tissue 
distribution of thrombospondin-4 appears to be quite 
different from thrombospondins 1 and 3. 

Using the nucleotide sequence information provided in SEQ
ID NOS.: 1 and 3, cell lines expressing the thrombospondin-4
proteins can be established (Example 4). Likewise, homologs
to SEQ ID NO.: 3 of other vertebrate (e.g., mammalian)
species can be identified using conventional techniques,
described in greater detail below. Such genetic engineering 
techniques are well within the scope of those of ordinary 
skill in the art. 

The human gene encoding thrombospondin-4 has been cloned,
isolated and expressed. A general protocol is presented
below. This protocol is intended to obtain a cDNA having a
complete reading frame for the human homolog of Xenopus
laevis thrombospondin-4. This objective is achieved by
generating a probe to the human homolog, screening a human
cDNA library with the probe and, finally, generating a coding
sequence from the sequence identified in the library.

A. Cloning Xenopus laevis Thrombospondin

A cDNA encoding Xenopus thrombospondin-4 was cloned by
first isolating mouse thrombospondin-1, chicken
thrombospondin-2 and Xenopus thrombospondin-1 clones by
screening libraries with existing probes for other species at
low stringency. The resulting sequences for these
thrombospondin members were aligned with human
thrombospondin-1 and highly conserved regions were
identified. Based on these sequences, degenerate
oligonucleotides were synthesized and used as primers for the
polymerase chain reaction (PCR) (SEQ ID NOS.: 5 and 6; Example
1A).

The preferred primer sequences fall in the type 3 repeat
domain and the carboxyl terminus of the molecule. SEQ ID
NO.: 5 depicts the sequence of the forward primers and SEQ ID
NO.: 6 depicts the sequence of the reverse primers.

Polymerase chain reaction (PCR) was run using Xenopus 
laevis cDNA as a template. PCR products were sized, 
fractionated and subcloned into plasmid vectors. To complete 
the sequence and establish the validity of the Xenopus 
thrombospondin-4 clone, the Xenopus cDNA library was screened 
using the PCR products as the probes. The probes were
labelled and hybridization performed. Plaques were purified
and amplified to yield high titre plate stocks. Restriction
fragments were then subcloned. Sequencing was then performed
using well known methods (e.g., the chain termination method
of Sanger et al.; see Example 1B).

Xenopus laevis clones (designated XS3 and XS9; see
Example 1B) were used to determine the nucleotide sequence of
Xenopus thrombospondin-4 on both strands. Since XS9 is still
650 bp smaller than the message size predicted by Northern
blot analysis, two approaches were used to complete the
sequence: (i) the Xenopus cDNA library was rescreened; and
(ii) two PCR primers that include sequences within the 5' end
were used in conjunction with two PCR primers from the
polylinker to perform PCR on the library. The PCR protocol
was that described in Example 1A. Neither approach yielded
any additional sequences.

B. Cloning the Gene for Human Thrombospondin-4 

The approach used to screen a DNA library for the 
presence of a thrombospondin-4 coding sequence corresponding 
to a human homolog includes generating preferred probes using 
the polymerase chain reaction. The probes were produced by 
using a human heart cDNA library as a template for primers 
(SEQ ID NOS.: 7 and 8). Based on the degree of codon
degeneracy of the predicted amino acid sequence, primers were
derived from the Xenopus thrombospondin-4 sequence of SEQ ID
NOS.: 1 and 2.
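
Codon degeneracy, which guides the choice of primer sites, can be estimated from the number of codons that encode each amino acid in the standard genetic code. The sketch below is illustrative and is not the procedure used to design SEQ ID NOS.: 5 to 8; the function names and the example peptides are hypothetical.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_AMINO_ACID = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide):
    """Number of distinct nucleotide sequences that can encode the peptide."""
    return prod(CODONS_PER_AMINO_ACID[aa] for aa in peptide)


def least_degenerate_window(peptide, width):
    """Return (start index, window, degeneracy) of the least degenerate window."""
    if width > len(peptide):
        raise ValueError("window is longer than the peptide")
    scored = [
        (degeneracy(peptide[i:i + width]), i)
        for i in range(len(peptide) - width + 1)
    ]
    best, start = min(scored)  # ties resolve to the earliest window
    return start, peptide[start:start + width], best


if __name__ == "__main__":
    # Hypothetical peptide used only to exercise the functions.
    print(least_degenerate_window("LSRMWDK", 3))
```

Windows rich in methionine and tryptophan score lowest, which is why such stretches are favoured when a degenerate primer must be kept to a manageable number of sequence variants.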

The product of the PCR reaction was cloned and the human 
heart cDNA library rescreened using the PCR product as the 
probe(s) (Example 3). This preferred method required 
identifying tissue that expresses thrombospondin-4 as a 
source of RNA (e.g., human heart tissue).

Other tissues expressing the human homolog can, however,
be identified by RNA analysis, i.e., Northern analysis under
low stringency conditions. Confirmation of a human tissue as
an RNA source and identification of additional sources of
tissue can be accomplished by preparing RNA from the selected
tissue and performing Northern blot analysis under low
stringency conditions using PCR product as the probe(s). A
suitable range of such stringency conditions is described in
Krause, M.H. and Aaronson, S.A., 1991, Methods in Enzymology
200: 546-556. Additionally, genomic libraries can be
screened for the presence of the human homolog coding
sequence using PCR-generated probe(s).

C. Testing and Cloning Related Thrombospondin-4
Molecules

The invention also pertains to a more general protocol
for isolating the gene for thrombospondin-4 from vertebrates,
in particular from non-human vertebrates such as cows, pigs,
monkeys and the like. In this approach, total mRNA can be
isolated from mammalian tissues or from cell lines likely to
express thrombospondin-4 (e.g., cow or chimpanzee heart
muscle). In general, total RNA from the selected tissue or
cell culture is isolated using conventional methods.
Subsequent isolation of mRNA is typically accomplished by
oligo(dT) chromatography. RNA for Northern analysis is
size-fractionated by electrophoresis and the RNA transcripts
are transferred to nitrocellulose according to conventional
protocols (Sambrook, J. et al., Molecular Cloning, Cold
Spring Harbor Press, N.Y.).

A labelled PCR-generated probe capable of hybridizing 
with the human homolog of Xenopus thrombospondin-4 (SEQ ID 
NO.: 3) can serve to identify RNA transcripts complementary 
to at least a portion of the human thrombospondin-4 gene. 
For example, if Northern analysis indicates that RNA isolated 
from a cow heart muscle hybridizes with the labelled probe, 
then a cow heart muscle cDNA library is a likely candidate 
for screening and identification of a clone containing the 
coding sequence for a cow homolog of thrombospondin-4.

Northern analysis is used to confirm the presence of mRNA
fragments which hybridize to a probe corresponding to all or
part of thrombospondin-4. Northern analysis indicates the
presence and size of the transcript. This allows one to
determine whether a given cDNA clone is long enough to
encompass the entire transcript or whether it is necessary to
obtain further cDNA clones, i.e., if the length of the cDNA
clone is less than the length of RNA transcripts as seen by
Northern analysis. If the cDNA is not long enough, it is
necessary to perform one or more steps such as: (i) rescreen
the same library with the longest probes available to identify
a longer cDNA; (ii) screen a different cDNA library with the
longest probe; and (iii) prepare a primer-extended cDNA
library using a specific nucleotide primer corresponding to a
region close to, but not at, the most 5' available region.
This nucleotide sequence is used to prime reverse
transcription. The primer-extended library is then screened
with the probe corresponding to available sequences located
5' to the primer. See, for example, Rupp et al., Neuron, 6:
811-823 (1991).

The preferred clone of thrombospondin-4 has a complete
coding sequence, i.e., one that begins with methionine, ends
with a stop codon, and preferably has another in-frame stop
codon 5' to the first methionine. It is also desirable to
have a cDNA that is "full length", i.e., includes all of the
5' and 3' untranslated sequences. To assemble a long clone
from short fragments, the full-length sequence is determined 
by aligning the fragments based upon overlapping sequences. 
Thereafter, the full-length clone is prepared by ligating the 
fragments together using appropriate restriction enzymes. 
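
The two checks described above, aligning fragments by their overlapping sequences and confirming that the result is a complete coding sequence, can be sketched in silico. This is a simplified illustration under stated assumptions: it requires exact overlaps, uses toy sequences, and the function names are hypothetical. The physical clone is still prepared by ligation as described in the text.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def merge_by_overlap(left, right, min_overlap=3):
    """Join two fragments where the end of `left` matches the start of `right`.

    The longest exact overlap of at least `min_overlap` bases is used.
    """
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    raise ValueError("fragments do not overlap sufficiently")


def is_complete_coding_sequence(seq):
    """True if seq starts with ATG, ends with a stop codon, is a whole
    number of codons, and contains no internal in-frame stop codon."""
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return (
        codons[0] == "ATG"
        and codons[-1] in STOP_CODONS
        and not any(codon in STOP_CODONS for codon in codons[1:-1])
    )


if __name__ == "__main__":
    # Toy fragments for illustration only.
    assembled = merge_by_overlap("ATGAAACCC", "CCCGGGTAA")
    print(assembled, is_complete_coding_sequence(assembled))
```

Real cDNA fragments may contain sequencing errors, so an alignment tool that tolerates mismatches would normally replace the exact-match test used here.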

As discussed above, PCR-generated probes can be used in 
the protocol for isolating non-human mammalian homologs to 
thrombospondin-4. Moreover, probes to be used in the general 
method for isolating non-human, vertebrate thrombospondin-4 
can now include oligonucleotides, all of which are part of 
the human homolog shown in SEQ ID NO.: 3. Moreover,
antibodies reactive with this human homolog can also be 
used. Unlike the PCR approach to generating a probe, the 
above-identified probes do not require prior isolation of RNA 
from a tissue expressing the vertebrate homolog. 

In particular, an oligonucleotide probe typically has a 
sequence somewhat longer than that used for the PCR primers. 
A longer sequence is preferable for the probe, and it is 
important that codon degeneracy be minimized. A 
representative protocol for the preparation of an 
oligonucleotide probe for screening a cDNA library is 
described in Sambrook, J. et al., Molecular Cloning, Cold
Spring Harbor Press, New York, 1989. In general, the probe
is labelled, e.g., with P-32, and used to screen clones of a
cDNA or genomic library.

Alternatively, the library can be screened using
conventional immunization techniques, such as those described
in Harlow and Lane (1988), Antibodies, Cold Spring
Harbor Press, New York. Antibodies prepared using purified
thrombospondin-4 as an immunogen are preferably first tested 
for cross reactivity with the homolog of thrombospondin-4 
from other species. Other approaches to preparing antibodies 
for use in screening DNA libraries, as well as for use in 
diagnostic and research applications, are described below. 

D. Nucleic Acid and Protein Sequences

The nucleic acid sequence of the human thrombospondin-4 
is depicted in SEQ ID NO: 3. This sequence, its functional 
equivalent, or unique fragments of this sequence may be used 
in accordance with the invention. The term "unique 
fragments" refers to portions of the thrombospondin-4 nucleic 
acid sequence that find no counterpart in the known sequences 
of thrombospondins -1 and -2. Subsequences comprising 
hybridizable portions of the thrombospondin-4 sequence have 
use, e.g., in nucleic acid hybridization assays, Southern 
and Northern blot analyses, etc. 

Nevertheless, the nucleic acid sequence depicted in SEQ 
ID NO: 3 can be altered by mutations such as substitutions, 
additions or deletions that provide for functionally 
equivalent nucleic acid sequences. According to the present 
invention, a nucleic acid sequence is "functionally 
equivalent" compared with the nucleic acid sequence depicted 
in SEQ ID NO: 3, if it satisfies at least one of the 
following conditions: (i) the nucleic acid sequence has the 
ability to hybridize to thrombospondin-4, but it does not 
necessarily hybridize to thrombospondin-4 with an affinity 
that is the same as that of the natural thrombospondin-4 
nucleic acid sequence; and/or (ii) the nucleic acid can serve 
as a probe to distinguish between thrombospondin-4 and the 
other known thrombospondins. A probe that can "distinguish" 
between thrombospondin-4 and the other thrombospondins refers 
to a probe that will hybridize to a thrombospondin nucleic 
acid sequence that encodes a polypeptide having at 
least four type 2 domains; that lacks any type 1 domains 
and/or that lacks a region of procollagen homology. The term 
"probe", therefore, refers to a ligand of known qualities that 
can bind selectively to a target. As applied to the nucleic 
acid sequences of the invention, the term "probe" refers to a 
strand of nucleic acid having a base sequence complementary 
to a target strand. 
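
The notion of a probe as a strand whose base sequence is complementary to a target strand can be illustrated with a minimal Python sketch. The function names are illustrative assumptions and form no part of the disclosed methods; the example sequence is the 89PCR oligonucleotide (SEQ ID NO.: 7) of Example 2. Actual hybridization tolerates mismatches to a degree that depends on stringency, which this exact-match sketch does not attempt to model.

```python
# Complement table for unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def probe_binds(probe: str, target: str) -> bool:
    """True if the probe is exactly complementary to a stretch of the target.

    Simplifying assumption: perfect base pairing only, no mismatches.
    """
    return reverse_complement(probe).upper() in target.upper()


primer_89 = "AATGAGCAGGACAACTGTGT"          # 89PCR, SEQ ID NO.: 7
antisense = reverse_complement(primer_89)   # strand that would pair with it
print(antisense)
```

The same function also gives the sequence of an antisense oligonucleotide for a chosen stretch of coding strand.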

Because the nucleic acid sequence of thrombospondin-4 is 
now known, those of ordinary skill in the art can readily 
determine those nucleic acid sequences of thrombospondin-4 
that are not homologous to any other nucleic acid sequence, 
including the other thrombospondin sequences. These 
non-homologous sequences, and peptides encoded by them, are 
referred to as "unique" fragments and are meant to be 
included within the scope of the present invention. 

Moreover, due to the degeneracy of nucleotide coding 
sequences, other nucleic acid sequences may be used in the 
practice of the present invention. These include, but are 
not limited to, sequences comprising all or portions of the 
thrombospondin-4 genes depicted in SEQ ID NO: 1 and 3 which 
are altered by the substitution of different codons that 
encode the same amino acid residue within the sequence, thus 
producing a silent change. Such altered sequences are 
regarded as equivalents of the specifically claimed 
sequences. 
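
A silent change of the kind described above can be checked mechanically. The following Python sketch assumes the standard genetic code and uses illustrative function names that are not part of the disclosure. The first sequence consists of the first six codons of the 89PCR oligonucleotide (SEQ ID NO.: 7); the second is a hypothetical rewrite chosen only for illustration.

```python
from itertools import product

# Standard genetic code; codons are enumerated with bases in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(dna: str) -> str:
    """Translate a coding strand in frame 1; a trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_silent_variant(original: str, altered: str) -> bool:
    """True if the sequences differ in bases but encode identical residues."""
    return (
        len(original) == len(altered)
        and original.upper() != altered.upper()
        and translate(original) == translate(altered)
    )


original = "AATGAGCAGGACAACTGT"   # first six codons of 89PCR (SEQ ID NO.: 7)
altered = "AACGAACAAGATAATTGC"    # hypothetical: every codon swapped for a synonym
print(translate(original), is_silent_variant(original, altered))
```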

Thrombospondin-4 proteins or unique fragments or 
derivatives thereof include, but are not limited to, those 
containing as a primary amino acid sequence all, or unique 
parts of the amino acid residues substantially as depicted in 
SEQ ID NO.: 2 and SEQ ID NO.: 4, including altered sequences 
in which functionally equivalent amino acid residues are 
substituted for residues within the sequence, resulting in a 
silent change. According to the invention, an amino acid 
sequence is "functionally equivalent" compared with the sequences 
depicted in SEQ ID NOS.: 2 and 4 if the amino acid sequence 
contains one or more amino acid residues within the sequence 
which can be substituted by another amino acid of a similar 
polarity which acts as a functional equivalent. Substitutes 
for an amino acid within the sequence may be selected from 
other members of the class to which the amino acid belongs. 
The non-polar (hydrophobic) amino acids include alanine, 
leucine, isoleucine, valine, proline, phenylalanine, 
tryptophan and methionine. The polar neutral amino acids 
include glycine, serine, threonine, cysteine, tyrosine, 
asparagine, and glutamine. The positively charged (basic) 
amino acids include arginine, lysine and histidine. The 
negatively charged (acidic) amino acids include aspartic 
acid and glutamic acid. 
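
The four classes listed above translate directly into a lookup. The sketch below, in Python with one-letter residue codes, tests whether a substitution stays within a class. The names are illustrative assumptions, and membership in the same class is only the criterion stated in the text, not a guarantee that function is retained.

```python
# Residue classes exactly as enumerated in the text, in one-letter code.
AMINO_ACID_CLASSES = {
    "non-polar": set("ALIVPFWM"),
    "polar neutral": set("GSTCYNQ"),
    "basic": set("RKH"),
    "acidic": set("DE"),
}


def residue_class(aa: str) -> str:
    """Return the name of the class to which a residue belongs."""
    for name, members in AMINO_ACID_CLASSES.items():
        if aa.upper() in members:
            return name
    raise ValueError(f"unknown residue: {aa!r}")


def is_conservative(a: str, b: str) -> bool:
    """True if both residues fall in the same class."""
    return residue_class(a) == residue_class(b)


def same_class_variant(seq_a: str, seq_b: str) -> bool:
    """True if two equal-length sequences differ only by within-class substitutions."""
    return len(seq_a) == len(seq_b) and all(
        is_conservative(x, y) for x, y in zip(seq_a, seq_b)
    )


print(is_conservative("L", "I"), is_conservative("K", "E"))
```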

Also included within the scope of the invention are 
thrombospondin-4 proteins or unique fragments or derivatives 
thereof which are differentially modified during or after 
translation, e.g., by phosphorylation, glycosylation, 
crosslinking, acylation, proteolytic cleavage, linkage to an 
antibody molecule, membrane molecule or other ligand 
(Ferguson et al., 1988, Ann. Rev. Biochem. 57: 285-320). 

In addition, the recombinant thrombospondin-4-encoding 
nucleic acid sequences of the invention may be engineered so 
as to modify processing or expression of thrombospondin-4. 
For example, and not by way of limitation, the 
thrombospondin-4 gene may be combined with a promoter 
sequence and/or a ribosome binding site using well 
characterized methods, and thereby facilitate harvesting or 
bioavailability. 

Additionally, a given thrombospondin-4 can be mutated in 
vitro or in vivo, to create variations in coding regions 
and/or form new restriction endonuclease sites or destroy 
preexisting ones, to facilitate further in vitro 
modification. Any technique for mutagenesis known in the art 
can be used including, but not limited to, in vitro 
site-directed mutagenesis (Hutchinson, et al., 1978, J. Biol. 
Chem. 253:6551), use of TAB® linkers (Pharmacia), 
PCR-directed mutagenesis, and the like. 




The thrombospondin-4 of the invention also includes 
non-human homologs of the amino acid sequence of SEQ ID 
NO: 4. The thrombospondin-4 peptides of the invention may be 
prepared by recombinant nucleic acid expression techniques or 
by chemical synthesis using standard peptide synthesis 
techniques. 

Also within the scope of the invention are nucleic acid 
sequences or proteins encoded by nucleic acid sequences 
derived from the same gene but lacking one or more structural 
features (for instance the type 2 or 3 domains) as a result 
of alternative splicing of transcripts from a gene that also 
encodes the complete thrombospondin-4 gene, as defined 
previously. 

Nucleic acid sequences complementary to DNA or RNA 
sequences encoding thrombospondin-4 or a functionally active 
portion thereof are also provided. In animals, particularly 
transgenic animals, RNA transcripts of a desired gene or 
genes may be translated into polypeptide products having a 
host of phenotypic actions. In a particular aspect of the 
invention, antisense thrombospondin-4 oligonucleotides can be 
synthesized. These oligonucleotides may have activity in 
their own right, such as antisense reagents which block 
translation or inhibit RNA function. Thus, where 
thrombospondin-4 is to be produced utilizing the nucleotide 
sequences of this invention, the DNA sequence can be in an 
inverted orientation which gives rise to a negative sense 
("antisense") RNA on transcription. This antisense RNA is 
not capable of being translated to the desired 
thrombospondin-4 product, as it is in the wrong orientation 
and would give a nonsensical product if translated. 

E. Expression of Thrombospondin-4 

The present invention also permits the expression, 
isolation, and purification of the thrombospondin-4 
polypeptide. A thrombospondin-4 gene may be cloned or 
subcloned using any method known in the art. A large number 
of vector-host systems known in the art may be used. 
Possible vectors include, but are not limited to, cosmids, 
plasmids or modified viruses, but the vector system must be 
compatible with the host cell used. Viral vectors include, 
but are not limited to, vaccinia virus, or lambda 
derivatives. Plasmids include, but are not limited to, 
pBR322, pUC, or Bluescript® (Stratagene) plasmid 
derivatives. Recombinant thrombospondin-4 molecules can be 
introduced into host cells via transformation, transfection, 
infection, electroporation, etc. Generally, introduction of 
thrombospondin-4 molecules into a host is accomplished using 
a vector containing thrombospondin DNA under control by 
regulatory regions of the DNA that function in the host cell. 

In a preferred method of expressing thrombospondin-4, the 
cDNA that corresponds to the entire coding region of human 
thrombospondin-4, constructed from two overlapping clones, 
was moved to the mammalian expression vector, pLEN-PT (See 
Example 4). The details of the experimental approach for 
transfection, selection and characterization of the expressed 
thrombospondin-4 protein were similar to those that have been 
used previously for human thrombospondin-1 (see Biochemistry, 
31: 1173-1180 (1992)), the entire contents of which are 
incorporated herein by reference. 

Once the thrombospondin-4 protein is expressed, it may be 
isolated and purified by standard methods including 
chromatography (e.g., ion exchange, affinity, and sizing 
column chromatography), centrifugation, differential 
solubility, or by any other standard technique for the 
purification of proteins. In particular, thrombospondin-4 
protein may be isolated by binding to an affinity column 
comprising antibodies to thrombospondin-4 bound to a 
stationary support. 

F. Preparation of Antibodies to Thrombospondin-4 

The term "antibodies" is meant to include monoclonal 
antibodies, polyclonal antibodies and antibodies prepared by 
recombinant nucleic acid techniques that are selectively 
reactive with thrombospondin-4. The term "selectively 
reactive" refers to those antibodies that react with 
thrombospondin-4, and do not react with the other 
thrombospondins. Antibodies include antibodies raised 
against Xenopus thrombospondin-4 polypeptide (SEQ ID NO.: 2) 
and intended to cross-react with the human homolog. These 
antibodies are useful for diagnostic applications. Other 
antibodies include antibodies raised against Xenopus 
thrombospondin-4, which antibodies are generally used for 
research purposes. These antibodies include those raised 
against short, synthetic peptides of the Xenopus 
thrombospondin-4 sequence. 

Finally, antibodies are raised against the human homolog 
(SEQ ID NO.: 4), isolated by standard protein purification 
methods. Generally, a peptide immunogen is first attached to 
a carrier to enhance the immunogenic response. Although the 
peptide immunogen can correspond to any portion of the amino 
acid sequence of the human thrombospondin-4 protein or to 
variants of the sequence, such as the amino acid sequences 
corresponding to the primers and probes described, certain 
peptides are more likely than others to provoke an immediate 
response. For example, a peptide including the C-terminal 
amino acid is more likely to generate an antibody response. 

Other alternatives to preparing antibodies reactive with 
the human homolog include: immunizing an animal with a 
protein expressed by a bacterial or eucaryotic cell, which 
cell includes the coding sequence for: (i) all or part of 
the human homolog; or (ii) the coding sequence for all or 
part of the Xenopus thrombospondin-4 protein. 

Antibodies can also be prepared by immunizing an animal 
with whole cells that are expressing all or a part of a cDNA 
encoding the thrombospondin-4 protein. 

To further improve the likelihood of producing an 
anti-thrombospondin-4 immune response, the amino acid 
sequence of thrombospondin-4 may be analyzed in order to 
identify portions of the molecule which may be associated 
with increased immunogenicity. For example, the amino acid 
sequence may be subjected to computer analysis to identify 
surface epitopes which present computer-generated plots of 
antigenic index, an amphiphilic helix, amphiphilic sheet, 
hydrophilicity, and the like. Alternatively, the deduced 
amino acid sequences of thrombospondin-4 from different 
species could be compared, and relatively non-homologous 
regions identified. These non-homologous regions would be 
more likely to be immunogenic across various species. 
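
One simple form of the computer analysis mentioned above is a sliding-window hydropathy scan. The text does not name a particular algorithm; the Python sketch below substitutes the Kyte-Doolittle hydropathy scale as a familiar stand-in, on the assumption that strongly negative (hydrophilic) windows are more likely to be surface exposed. The window size, function names and test peptide are hypothetical.

```python
# Kyte-Doolittle hydropathy values; more negative means more hydrophilic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_scores(protein: str, window: int = 7):
    """Return (start index, mean hydropathy) for every full window."""
    protein = protein.upper()
    return [
        (i, sum(KYTE_DOOLITTLE[aa] for aa in protein[i:i + window]) / window)
        for i in range(len(protein) - window + 1)
    ]


def most_hydrophilic(protein: str, window: int = 7):
    """Return the (start index, mean hydropathy) of the most hydrophilic window."""
    return min(window_scores(protein, window), key=lambda item: item[1])


toy_peptide = "LLLLKDEKLLLL"   # hypothetical sequence, not taken from SEQ ID NO.: 4
start, score = most_hydrophilic(toy_peptide, window=4)
print(start, toy_peptide[start:start + 4], round(score, 2))
```

Candidate stretches found in this way would still need to be confirmed experimentally, since a low hydropathy score is only a rough indicator of surface exposure.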

For preparation of monoclonal antibodies directed toward 
thrombospondin-4, any technique which provides for the 
production of antibody molecules by continuous cell lines and 
culture may be used. For example, the hybridoma technique 
originally developed by Kohler and Milstein (Nature, 256: 
495-497), as well as the trioma technique, the human B-cell 
hybridoma technique (Kozbor et al., Immunology Today, 4:72), 
and the EBV-hybridoma technique to produce human monoclonal 
antibodies, and the like, are within the scope of the present 
invention. 

Further, single-chain antibody (SCA) methods are also 
available to form anti-thrombospondin-4 antibodies (Ladner 
et al., U.S. Patents 4,704,694 and 4,976,778). 

The monoclonal antibodies may be human monoclonal 
antibodies or chimeric human-mouse (or other species) 
monoclonal antibodies. The present invention provides for 
antibody molecules as well as fragments of such antibody 
molecules. 

G. Assays/Utilities 

The present invention provides for assay systems in which 
activity or activities resulting from exposure to a peptide 
or non-peptide compound may be detected by measuring a 
physiological response to the compound in a cell or cell line 
which expresses the thrombospondin-4 molecules of the 
invention. A "physiological response" may comprise any 
biological response, including but not limited to 
transcriptional activation of certain nucleic acid sequences 
(e.g., promoter/enhancer elements as well as structural 
genes), translation, or phosphorylation, the induction of 
secondary processes, and morphological changes, such as 
neurite sprouting. 

The present invention thus provides for the development 
of novel assay systems which may be utilized in the screening 
of compounds. Target cells expressing thrombospondin-4, 
which bind to the compounds, may be produced by transfection 
with thrombospondin-4-encoding nucleic acid. 

Once target cell lines are produced or identified, it may 
be desirable to select for cells which are exceptionally 
sensitive to a particular compound. Such target cells may 
express large amounts of thrombospondin-4; target cells 
expressing a relative abundance of thrombospondin-4 could be 
identified by selecting target cells which bind to high 
levels of the compound, for example cells which, when 
incubated with a compound/tag and subjected to 
immunofluorescence assay, produce a relatively higher degree 
of fluorescence. Alternatively, cell lines which are 
exceptionally sensitive to a compound may exhibit a 
relatively strong biological response, such as a sharp 
increase in immediate early gene products such as c-fos or 
c-jun, in response to thrombospondin-4 binding. By 
developing assay systems using target cells which are 
extremely sensitive to a compound, the present invention 
provides for methods of screening for low levels of 
thrombospondin-4 activity. 

In particular, using recombinant DNA techniques, the 
present invention provides for thrombospondin-4 target cells 
which are engineered to be highly sensitive to 
thrombospondin-4 binding compounds. For example, the 
thrombospondin-4 gene, cloned according to the methods set 
forth above, may be inserted into cells which naturally 
express thrombospondin-4 such that the recombinant 
thrombospondin-4 gene is expressed at high levels. Since 
thrombospondins generally bind large amounts of calcium, 
cells expressing thrombospondin-4 may find use in calcium 
bioassay methods, particularly in clinical settings where 
elevated blood calcium may be indicative of parathyroid or 
bone dysfunction. 

The present invention also provides for experimental 
model systems for studying the physiological role of the 
native thrombospondin-4. In these model systems, 
thrombospondin-4 protein, peptide fragment, or a derivative 
thereof, may be either supplied to the system or produced 
within the system. Such model systems could be used to study 
the effects of thrombospondin-4 excess or depletion. The 
experimental model systems may be used to study the effects 
of increased or decreased response to ligand in cell or 
tissue cultures, in whole animals, or in particular cells or 
tissues within whole animals or tissue culture systems, or 
over specified time intervals (including during 
embryogenesis). 

In additional embodiments of the invention, a recombinant 
thrombospondin-4 gene may be used to inactivate the 
endogenous gene by homologous recombination, and thereby 
create a thrombospondin-4 deficient cell, tissue, or animal. 
For example, and not by way of limitation, a recombinant 
thrombospondin-4 gene may be engineered to contain an 
insertional mutation (e.g., the neo gene) which, when 
inserted, inactivates transcription of thrombospondin-4. 
Such a construct, under the control of a suitable promoter 
operatively linked to the thrombospondin-4 gene, may be 
introduced into a cell by a technique such as transfection, 
transduction, injection, etc. In particular, stem cells 
lacking an intact thrombospondin-4 gene may generate 
transgenic animals deficient in thrombospondin-4. In a 
specific embodiment of the invention (See Example 6), the 
endogenous thrombospondin-4 gene of a cell may be inactivated 
by homologous recombination with a mutant thrombospondin-4 
gene to form a transgenic animal lacking the ability to 
express thrombospondin-4. In another embodiment, a construct 
can be provided that, upon transcription, produces an 
"anti-sense" nucleic acid sequence which, upon translation, 
will not produce the required thrombospondin-4 protein. 

A "transgenic animal" is an animal having cells that 
contain DNA which has been artificially inserted into a cell, 
which DNA becomes part of the genome of the animal which 
develops from that cell. The preferred DNA encodes for 
thrombospondin-4 and may be entirely foreign to the 
transgenic animal or may be homologous to the natural 
thrombospondin-4 of the transgenic animal, but which is 
inserted into the animal's genome at a location which differs 
from that of the natural homolog. 

In a further embodiment of the invention, 
thrombospondin-4 expression may be reduced by providing 
thrombospondin-4 expressing cells, preferably in a transgenic 
animal, with an amount of thrombospondin-4 anti-sense RNA or 
DNA effective to reduce expression of thrombospondin-4 
protein. 

A transgenic animal (preferably a non-human mammal) can 
also be provided with a thrombospondin-4 DNA sequence that 
also encodes a repressor protein (e.g., the E. coli lac 
repressor). The repressor protein can bind to a specific DNA 
sequence of thrombospondin-4, thereby reducing ("repressing") 
the level of transcription of thrombospondin-4. 

Transgenic animals of the invention which have attenuated 
levels of thrombospondin-4 expression have general 
applicability to the field of transgenic animal generation, 
as they permit control of the level of expression of genes. 

According to the present invention, thrombospondin-4 
probes may be used to identify cells and tissues of 
transgenic animals which lack the ability to transcribe 
thrombospondin-4. Thrombospondin-4 expression may be 
evidenced by transcription of thrombospondin-4 mRNA or 
production of thrombospondin-4 protein, detected using probes 
which can distinguish thrombospondin-4 from thrombospondins 
-1 and -2, as described above. One variety of probe which 
may be used to detect thrombospondin-4 expression is a 
nucleic acid probe, containing a sequence encoding for at 
least four type 2 domains. Alternatively, the probe can 
contain a thrombospondin sequence of the invention lacking 
type 1 domains or procollagen homology. Detection of 
thrombospondin-4-encoding mRNA may be easily accomplished by 
any method known in the art, including, but not limited to, 
in situ hybridization, Northern blot analysis, or PCR related 
techniques. 

Another variety of probe which may be used is 
anti-thrombospondin-4 antibody. 

The above-mentioned probes may be used experimentally to 
identify cells or tissues which hitherto had not been shown 
to express thrombospondin-4. Furthermore, these methods may 
be used to identify the expression of thrombospondin-4 by 
aberrant tissues, such as malignancies. 

The invention will be further illustrated by the 
following, non-limiting examples. 

EXAMPLE 1: Cloning the Xenopus thrombospondin-4 gene 

A: Polymerase Chain Reaction 

Aliquots (1, 5 and 25 µl) of a Xenopus laevis stage 45 
cDNA library (unpublished) were brought to a final volume of 
71.5 µl with H2O. The samples were heated to 70°C for 5 
minutes then cooled on ice. To each sample, 10 µl of 10x 
reaction buffer (Cetus), 6 µl of 25 mM MgCl2, 16 µl of 
dNTPs and 300 pmoles of primer were added (SEQ ID NO.: 5 
and 6). 

The reaction mixture was heated to 95°C for 5 minutes and 
then equilibrated to the annealing temperature (37-48°C). 
TAQ polymerase (2.5 units) was added and the sample was 
heated to 72°C for 3 minutes. The amplification cycles were 
(1) incubate at 94°C for 1 minute and 20 seconds, (2) 
incubate at 48°C for 2 minutes, (3) ramp to 72°C over 2 
minutes, and (4) incubate at 72°C for 3 minutes. This cycle 
was repeated 30-40 times; finally the sample was incubated at 
72°C for 7 minutes. The PCR products were separated by 
agarose gel electrophoresis and the appropriately sized 
products were subcloned into pBluescript KS or SK 
(Stratagene, La Jolla, CA). 
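
The thermal profile above can be written down as data, which makes the programmed run time easy to estimate. The Python sketch below is illustrative only: the step labels and the `Step` type are assumptions, and the total ignores the time needed to equilibrate to the annealing temperature and any instrument ramping other than the stated 2-minute ramp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str      # descriptive label (assumed, for readability only)
    temp_c: float   # target temperature in degrees Celsius
    seconds: int    # hold or ramp time in seconds


# Steps performed once before cycling.
INITIAL = [
    Step("initial denaturation", 95, 5 * 60),
    Step("extension after adding polymerase", 72, 3 * 60),
]

# One amplification cycle, steps (1) to (4) of the text.
CYCLE = [
    Step("denature", 94, 80),
    Step("anneal", 48, 2 * 60),
    Step("ramp to extension temperature", 72, 2 * 60),
    Step("extend", 72, 3 * 60),
]

# Final incubation after the last cycle.
FINAL = [Step("final extension", 72, 7 * 60)]


def total_seconds(n_cycles: int) -> int:
    """Programmed time for the whole run with the given number of cycles."""
    once = sum(s.seconds for s in INITIAL + FINAL)
    per_cycle = sum(s.seconds for s in CYCLE)
    return once + n_cycles * per_cycle


for n in (30, 40):
    print(n, "cycles:", round(total_seconds(n) / 60, 1), "minutes")
```

On these figures each cycle takes 500 seconds, so the stated range of 30 to 40 cycles corresponds to roughly four and a half to six hours of programmed time.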

B: Cloning and Sequencing 

To establish the validity of the thrombospondin-4 clone 
and to complete the sequence, the Xenopus laevis stage 45 
library was screened with the PCR product as the probe. The 
probe was labeled with digoxigenin-dUTP, and hybridization 
performed using the Genius Kit® following the supplier's 
protocols (Boehringer Mannheim, Indianapolis, IN). Positive 
plaques were taken through successive rounds of screening 
with the same probe at progressively lower plaque densities. 
The purified plaques were amplified to yield high titre plate 
stocks. 

Because the Xenopus laevis library can be constructed in 
the λZAPII vector pBluescript II SK, the inserts are 
excised with helper phage and grown up directly following the 
supplier's protocols (Stratagene). BamHI and EcoRI fragments 
were subcloned into pBluescript II SK and KS. All sequencing 
was done by the chain termination method of Sanger et al. 
(1977) with Sequenase reagents (U.S. Biochemical Corp., 
Cleveland, OH). The ends of all clones and subclones were 
sequenced with the remainder of the sequence being determined 
using synthetic oligonucleotides as primers. The sequence of 
Xenopus thrombospondin-4 was obtained on both strands. 

The largest clone that we obtained from screening the 
Xenopus library was 2.8 kb. To complete the sequence, two 
oligonucleotides that corresponded to the bottom strand 
sequence near the 5' end were synthesized. The 
oligonucleotides and the pBluescript SK and primers 
(Stratagene, La Jolla, CA) were used as PCR primers with the 
library as the template. 

Degenerate PCR using the Xenopus laevis stage 45 library 
has produced four distinct sequences that are related to the 
thrombospondins. Two of the four sequences correspond to the 
two copies of the thrombospondin-1 gene that are present in 
the Xenopus genome (Urry et al., supra 1991). In some cases, 
both copies of the gene are expressed (e.g., J. Biol Chem. 
263: 5333-5340, DeSimone and Hynes, 1988). To date, the 
thrombospondin-1 sequences represent the majority of the 
products that we have obtained. However, two PCR products 
comprise sequences that are related to, but clearly distinct 
from thrombospondin-1. The sequences of these two PCR 
products (labeled TSP-4A and TSP-4B in FIG. 3, below) are 
very similar to each other suggesting that they represent the 
two copies of a newly identified gene in the Xenopus genome. 

To establish that these two new sequences are derived 
from the Xenopus library, and to obtain more nucleotide 
sequence, a probe was prepared from the PCR product and used 
to screen the library. A screen of 120,000 plaques produced 
four positive clones that range in size from 1.7 kb to 2.3 kb 
(FIG. 3, XF1-XF4). As shown in FIG. 3, the restriction maps 
of the clones indicate that two distinct gene products can be 
identified. The longest clone for each gene (XF1 and XF3) 
has been sequenced on both strands. The sequence of the PCR 
products is included in the sequences of these clones. These 
data confirm that the PCR product is derived from the Xenopus 
library and not from another contaminating source. 

When clone XF3 was used to probe a Northern blot of 
Xenopus stage 17 RNA, a 3.3 kb band was observed (FIG. 4). 
Since the message size is greater than the length of clone 
XF3 and the reading frame is open at the 5' end of the 
predicted amino acid sequence for clone XF3, the library was 
rescreened with the EcoRI fragment of clone XF1 in a second 
round of screening. This screen produced six additional 
clones (XS5-XS10, FIG. 3). Clone XS9 has been sequenced on 
both strands. Clone XS9 is approximately 469 nucleotides 
smaller than the message and the reading frame is open at the 
5' end of the predicted amino acid sequence. The library was 
rescreened in a third round of screening with the EcoRI to 
BamHI fragment of XS9. Four additional clones have been 
isolated (XT11-XT14); however, they did not contain additional 
nucleotide sequence. To obtain additional 5' end sequences, 
a Xenopus laevis stage 22 library (a gift of Dr. Douglas 
Melton) was screened. Restriction endonuclease mapping 
indicated that one of the clones (XM15; not shown in Fig. 3) 
contained additional 5' end sequence for the TSP-4B family. A 
single reading frame exists between nucleotides 103 and 2970 
(SEQ ID NO.: 1). There is a short (140 bp) 3' untranslated 
region that ends with a continuous series of adenosines. An 
AATAAA consensus polyadenylation signal is observed upstream 
of the poly A+ sequence. 
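
The coordinates of the reading frame allow a quick consistency check, sketched here in Python. The sketch assumes that nucleotide positions are 1-based and inclusive, as is conventional for sequence listings; whether position 2970 falls within the termination codon is not stated in the text and is not assumed here.

```python
def orf_length(first: int, last: int) -> int:
    """Length in nucleotides of a region given 1-based inclusive coordinates."""
    if last < first:
        raise ValueError("last coordinate precedes first")
    return last - first + 1


def codon_count(first: int, last: int) -> int:
    """Number of whole codons; raises if the region is not a multiple of three."""
    length = orf_length(first, last)
    if length % 3:
        raise ValueError(f"{length} nt is not a whole number of codons")
    return length // 3


# Reading frame of SEQ ID NO.: 1 as stated in the text.
print(orf_length(103, 2970), "nt,", codon_count(103, 2970), "codons")
```

The region spans 2868 nucleotides, which divides evenly into 956 codons, as expected for an uninterrupted reading frame.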

Example 2: Isolating the human homolog of Xenopus 
thrombospondin-4 

The cloning and nucleotide sequencing of Xenopus laevis 
thrombospondin-4 is described above. The predicted amino 
acid sequence (SEQ ID NO.: 2) has been searched to identify 
regions where the codon degeneracy is low. Two regions have 
been identified and the 89PCR (AAT GAG CAG GAC AAC TGT GT: 
SEQ ID NO.: 7) and 90PCR (TGC TCA GTC TGC TTC CAC AT: SEQ ID 
NO.: 8) oligonucleotides have been constructed. 
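
A search for regions of low codon degeneracy can be sketched as follows in Python. The approach counts, under the standard genetic code, how many distinct nucleotide sequences could encode each window of the protein and reports the window with the fewest. The function names and the test protein are hypothetical, and the sketch is not asserted to be the procedure actually used to choose 89PCR and 90PCR.

```python
from collections import Counter
from math import prod

# Standard genetic code as one residue per codon (bases in T, C, A, G order).
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Number of codons that encode each residue, e.g. 1 for M and W, 6 for L, S, R.
CODONS_PER_RESIDUE = dict(Counter(aa for aa in _AMINO if aa != "*"))


def degeneracy(peptide: str) -> int:
    """Number of distinct coding sequences that could encode the peptide."""
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide.upper())


def least_degenerate_window(protein: str, width: int):
    """Return (start index, peptide, degeneracy) of the least degenerate window."""
    windows = [
        (degeneracy(protein[i:i + width]), i, protein[i:i + width])
        for i in range(len(protein) - width + 1)
    ]
    score, start, peptide = min(windows)
    return start, peptide, score


# The first six codons of 89PCR encode N-E-Q-D-N-C; each residue has two codons.
print(degeneracy("NEQDNC"))

toy_protein = "LSRNEQDNCLSR"   # hypothetical sequence for illustration
print(least_degenerate_window(toy_protein, 6))
```

Windows rich in leucine, serine or arginine score poorly because each of those residues has six codons, whereas methionine and tryptophan contribute no degeneracy at all.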

Northern blot analysis of eight adult human tissues 
indicated that thrombospondin-4 is expressed in high levels 
in the heart and skeletal muscle (Example 3A). A heart cDNA 
library (the generous gift of Dr. Paul Allen) has been used 
as the template for polymerase chain reaction (PCR) with the 
primers 89PCR (SEQ ID NO.: 7) and 90PCR (SEQ ID NO.: 8). The 
product of the PCR reaction has been cloned into pBluescript 
vectors (Stratagene). After nucleotide sequencing to confirm 
that the PCR product corresponds to a sequence similar to 
Xenopus thrombospondin-4, the library has been screened with 
the PCR product as the probe. Clones have been isolated and 
characterized in terms of the sites for endonuclease and 
nucleotide sequence. The longest clone is approximately 
2 kb. Computer-assisted progressive sequence alignment has 
been used to construct a phylogenetic tree of the 
thrombospondin sequences. The results of this analysis are 
consistent with the hypothesis that the clones that have been 
isolated from the human heart library represent the human 
homolog of Xenopus thrombospondin-4. 

Example 3: Tissue Distribution of Thrombospondin-4 

A. Northern Blot Analysis (General Protocol) 

The Xenopus thrombospondin-4 clone XF3 was digested with 
EcoRI and XhoI and the insert purified. A variety of probes 
were used in the Northern analysis. 

A human thrombospondin-1 probe was the human full-length 
cDNA (Lawler et al., 1992). A human thrombospondin-3 probe 
was developed as follows: A genomic clone GPEM-2 containing 
human thrombospondin-3 was kindly provided by Dr. Sandra 
Gendler (Imperial Cancer Research Fund, London; Lancaster et 
al., Biochem. Biophys. Res. Comm., 173: 1019-1029, 1990). 
BamHI fragments of GPEM-2 were subcloned into pBluescript KS 
and the ends of each clone were sequenced. One of these 
clones contained sequences that were homologous to the 3' end 
of thrombospondin-1, 2 and 4. Based on this homology, the 
position of the 5' end of the last exon was determined. The 
3' end of this exon was taken to be the polyadenylation 
signal. Oligonucleotides that primed at the 5' and 3' ends 
of the last exon were used to amplify and clone a 293 bp DNA 
segment that corresponds to the last exon of human 
thrombospondin-3. 

A third probe was a β-actin probe (Clontech, Palo Alto, 
CA). The PCR product for the last exon of thrombospondin-3 
and the actin probe were radiolabeled directly with the 
Multiprime DNA Labelling System (Amersham, Arlington Heights, 
IL). 

A Northern blot that was prepared with Poly A+ RNA from 
adult human heart, brain, placenta, lung, liver, skeletal 
muscle, kidney and pancreas was obtained from Clontech. The 
blot was prehybridized and hybridized as described previously 
(Lawler and Hynes, supra 1986). 

B. Distribution of Thrombospondin-4 in Adult Human 
Tissues 

A Northern blot of poly A+ selected RNA from eight adult 
human tissues is shown in FIG. 5. The lanes are represented 
as: (a) adult human heart; (b) adult human brain; (c) adult 
human placenta; (d) adult human lung; (e) adult human liver; 
(f) adult human skeletal muscle; (g) adult human kidney; and 
(h) adult human pancreas. The size of the human 
thrombospondin-4 message is 3.4 kb. Thrombospondin-4 (TSP-4) 
showed a restricted pattern of expression when visualized 
using a 2.2 kb fragment of Xenopus thrombospondin-4. The 
positions and sizes of the markers are 
indicated on the left. 

High levels of expression were observed in the heart and 
skeletal muscle (FIG. 5). On longer exposures, a faint band 
was detectable in the tissue from the brain, lung and 
pancreas. No expression was detected in the placenta, 
liver or kidney. Comparable levels of the 2.0 kb form of 
β-actin were observed in all of the lanes except the pancreas 
(FIG. 5). Because a considerable fraction of the total mRNA 
in the pancreas encodes preproinsulin and α-amylase, other 
mRNAs give a lower hybridization signal. Thus, although the 
thrombospondin-4 signal is weak in the pancreas, the relative 
level of expression may be significant. 

When the same blot is probed for thrombospondin-3 
(TSP-3), the strongest signal was observed in the lung 
(FIG. 5). The size of the thrombospondin-3 message was also 
3.4 kb. Lower levels of thrombospondin-3 expression were 
observed in most of the lanes with the brain displaying the 
weakest hybridization signal (FIG. 5). The adult lung tissue 
also produced the strongest signal when the blot was probed 
with a human thrombospondin-1 probe (FIG. 5; TSP-1). Varying 
levels of thrombospondin-1 were observed in all of the 
tissues on the blot. In this case, the principal message was 
6.0 kb with faint bands at 4.5 and 3.6 kb. 

In addition, when a Northern blot was probed with one of 
the clones isolated from the human heart library (D7492 #9), 
the tissue distribution was identical to that observed when 
the Northern blot was probed with the Xenopus probe. These 
data indicate that the isolated clones correspond to human 
thrombospondin-4. Since the Northern blot indicated that the 
message for human thrombospondin-4 is 3.4 kb, we rescreened 
the library with an approximately 450 bp EcoRI to BamHI 
fragment from the 5' end of the known sequence. The new 
clones provided additional sequence, so that the total 
sequence is now 3074 bp. The 5' end includes a methionine 
residue that is followed by a 21 amino acid sequence that 
could represent a signal sequence. 
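
The reading of a candidate start methionine and the residues that follow it can be illustrated with a short sketch. The Python below is not part of the original disclosure: the function names `translate` and `first_orf` and the toy cDNA are assumptions made for illustration, and only the standard genetic code is relied upon. The first call uses the first ten codons of SEQ ID NO: 1 as listed later in this document.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def first_orf(dna: str) -> str:
    """Translate starting at the first ATG; empty string if none is found."""
    start = dna.find("ATG")
    return "" if start == -1 else translate(dna[start:])


# First ten codons of SEQ ID NO: 1 (Xenopus), read in frame 1.
print(translate("CAGCCCAAGTCCACAGTTACGCTCTTTGGA"))  # QPKSTVTLFG
# Hypothetical toy cDNA with a short 5' leader before the ATG.
print(first_orf("CCATGGCCTAAGG"))  # MA
```

Whether a stretch of residues after the methionine behaves as a signal sequence is a separate question that this sketch does not attempt to answer.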

Example 4: Expression of Thrombospondin-4 

Two human thrombospondin-4 clones were used to construct 
a full-length coding region cDNA. An EcoRV fragment of D9892 
#9 containing DNA (corresponding to nucleotides 1639 to 3074 
of SEQ ID NO. 3) was cloned into EcoRV-cut D9892 #11 
containing DNA (corresponding to nucleotides 1 to 1638 of SEQ 
ID NO. 3). DNA was made from transformants and was cut with 
EcoRI to determine the orientation of the inserted DNA. 
Since the insert co-electrophoresed with the vector, the DNA 
was cut with XmnI followed by EcoRI to purify a full-length 
cDNA for thrombospondin-4 that was cloned into the EcoRI site 
of pLEN-PT. 
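
Because EcoRV leaves blunt ends, the fragment can ligate in either orientation, which is why a digest with an asymmetrically placed EcoRI site is informative. The sketch below illustrates that reasoning only; the sequences are invented toy strings (not taken from SEQ ID NO. 3), the molecule is treated as linear for simplicity, and the helper names `digest_lengths` and `ligate` are hypothetical.

```python
ECORI = "GAATTC"  # EcoRI recognition site; the enzyme cuts after the first G


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def digest_lengths(seq: str, site: str = ECORI, cut_offset: int = 1) -> list:
    """Fragment lengths from a complete digest of a linear molecule."""
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    edges = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(edges, edges[1:])]


def ligate(upstream: str, insert: str, forward: bool = True) -> str:
    """Blunt-end ligation of the insert in the chosen orientation."""
    return upstream + (insert if forward else reverse_complement(insert))


# Toy sequences (hypothetical). Both carry one EcoRI site; the insert's site
# is off-centre and its ends are EcoRV half-sites (ATC... and ...GAT).
UPSTREAM = "AAA" + ECORI + "AAAAAA" + "GAT"
INSERT = "ATC" + "C" * 9 + ECORI + "CC" + "GAT"

print(digest_lengths(ligate(UPSTREAM, INSERT, forward=True)))   # [4, 27, 10]
print(digest_lengths(ligate(UPSTREAM, INSERT, forward=False)))  # [4, 20, 17]
```

The two orientations give different fragment patterns, which is the property the EcoRI digest in the text relies on.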

The final form of each construct is moved from M13mp8 to 
the mammalian expression vector pLEN-PT using XbaI sites. 
This vector was constructed by Drs. Paul Johnson and Richard 
Hynes by cloning the polylinker from the pECE vector into the 
BamHI site of pLEN (California Biotechnology Inc., Mountain 
View, CA). 

Expression of the inserted DNA is driven by the human 
metallothionein II promoter. A mixture of the construct 
(5-10 µg) and the neomycin resistance-containing plasmid 
pSV2neo (0.5-1.0 µg) is transfected into NIH 3T3 mouse 
fibroblast cells using the Lipofectin (Bethesda Research 
Laboratories, Gaithersburg, MD) protocol. 

The cells are grown in 100-mm dishes until they are 
approximately 50% confluent. The cells are washed once with 
3 mL of OptiMEM I reduced serum medium (Gibco Laboratories, 
Gaithersburg, MD) containing no serum, and then 3 mL of the 
same medium is placed in the dish. The DNA-Lipofectin 
mixture is added to the dishes with continuous swirling. 
After 24 h, the medium is changed to DME containing 10% FBS. 
After 48 h, the cells are trypsinized and replated in DME 
containing 10% FBS and 1 mg/mL Geneticin (G418, Gibco 
Laboratories). After approximately 10 days, individual 
G418-resistant colonies are subcloned, or the cells are 
allowed to grow and are handled as pools of G418-resistant 
clones. To produce culture supernatants for analysis, the 
cells are grown to confluence in four T75 flasks. Fresh 
medium is placed on the cells, and the cells are grown for 
48 h. The conditioned medium is removed, and DFP is added to 
1 mM and PMSF to 5 mM. After several hours at 0°C, the 
culture supernatants are frozen and stored at -20°C. 

EXAMPLE 5: Antibody Production 

A. Preparation of Fusion Proteins 

The specific methodology for construction of the fusion 
proteins varies depending upon the availability of 
restriction endonuclease sites. In general, endonuclease 
sites are chosen in close proximity to the region of cDNA of 
interest. The insert is purified by preparative agarose gel 
electrophoresis. The insert is isolated from the excised 
band by the glass bead method of Vogelstein and Gillespie, 
Proc. Nat. Acad. Sci. USA 76:615-619 (1979) or by 
electroelution by standard procedures recommended by the 
supplier (CBS Plastics). The insert is blunted and the 
appropriate EcoRI linker is added so that the reading frame 
of the insert is the same as that of the β-galactosidase 
gene. The insert is cut with EcoRI and ligated into λgt11 
by procedures recommended by the supplier (Promega Biotec, 
Protoclone λgt11 System). Lysogens of the Y1089 strain are 
selected by their ability to grow at 30°C but not at 42°C. 
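
The choice of an "appropriate" linker is modulo-3 arithmetic: the bases between the start of a vector codon and the start of an insert codon must total a multiple of three. The sketch below works through that arithmetic under stated assumptions. The parameter names and the candidate lengths 8, 10 and 12 are hypothetical placeholders for the number of bases a linker contributes at the junction, not values taken from this document or from a supplier's catalogue.

```python
def in_frame(vector_phase: int, linker_bases: int, insert_offset: int = 0) -> bool:
    """True if the insert's codons continue the vector's reading frame.

    vector_phase  -- bases of an unfinished vector codon at the junction (0-2)
    linker_bases  -- bases the linker contributes between vector and insert
    insert_offset -- insert bases that precede its first complete codon (0-2)
    """
    return (vector_phase + linker_bases + insert_offset) % 3 == 0


def linkers_keeping_frame(vector_phase, insert_offset=0, linker_lengths=(8, 10, 12)):
    """Return the candidate linker contributions that preserve the frame."""
    return [n for n in linker_lengths if in_frame(vector_phase, n, insert_offset)]


print(linkers_keeping_frame(vector_phase=1))  # [8]
print(linkers_keeping_frame(vector_phase=0))  # [12]
print(linkers_keeping_frame(vector_phase=2))  # [10]
```

In each case exactly one of the three candidate lengths restores the frame, which is why linkers are commonly offered in sets whose lengths differ modulo three.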

To prepare fusion protein, an overnight culture grown at 
30°C is diluted 1:10 (v/v) and grown for an additional hour 
at 30°C. The culture is incubated at 45°C for 15 minutes and 
10 pg/ml of isopropyl β-D-thiogalactopyranoside is added. The 
cultures are incubated for 1 to 2 hours at 37°C. The cells 
are pelleted by centrifugation and resuspended in 100 mM Tris 
(pH 8.0), 0.25 M NaCl and 0.2 mg/ml lysozyme (Sigma). After 
30 minutes at 0°C, the sample is rapidly frozen and thawed 
twice and then sonicated to disrupt the cells. The sample is 
centrifuged and the supernatant is applied to an 
anti-beta-galactosidase antibody affinity column (Promega 
Biotec, Protosorb, lacZ Immuno Affinity Adsorbent). The 
bound fusion protein is eluted with 0.1 M NaHCO3/Na2CO3 
(pH 10.8) and dialyzed to neutral pH. 

Alternatively, a glutathione S-transferase fusion protein 
is used as an antigen to raise a polyclonal rabbit 
anti-Xenopus laevis thrombospondin-4 antibody. An 
approximately 1.2 kb BamHI fragment of one of the Xenopus 
clones (XF3) is cloned into the bacterial expression vector 
pGEX-2T (Pharmacia). The fusion protein is expressed and 
purified according to established procedures (Current 
Protocols in Molecular Biology, John Wiley and Sons). The 
fusion protein is still bound to glutathione-agarose beads 
when it is used as an antigen. 

The antibody to human thrombospondin-4 can be produced by 
preparing a peptide fragment of human thrombospondin-4 
believed to be immunogenic. A preferred sequence is the 
sequence of the last 14 amino acids that is predicted from 
the cDNA sequence of SEQ ID NO.: 10 (FQEFQTQNFDRFDN). This 
peptide is synthesized, purified, coupled to a carrier and 
used to produce a polyclonal antiserum in rabbits using 
well-known methods. 

B. Production of Anti-Fusion Protein Antibodies 

Polyclonal rabbit antisera are produced in New Zealand 
White rabbits by subcutaneous injections at multiple sites of 
purified fusion proteins emulsified with an equal volume of 
Freund's complete adjuvant. After 4-6 weeks, the rabbits 
receive a subcutaneous booster injection of purified antigen 
emulsified in Freund's incomplete adjuvant and are boosted 
once each month until a good titre of antibody is obtained. 
Rabbits are bled 10 days after boosting. 

EXAMPLE 6: Preparation of Constructs for Transfections and 
Microinjections 

Methods for purification of DNA for microinjection are 
well known to those of ordinary skill in the art. See, for 
example, Hogan et al., Manipulating the Mouse Embryo, Cold 
Spring Harbor Laboratory, Cold Spring Harbor, NY (1986); and 
Palmiter et al., Nature, 300: 611 (1982). 

Construction of Transgenic Animals 

A variety of methods are available for the production of 
transgenic animals associated with this invention. DNA can 
be injected into the pronucleus of a fertilized egg before 
fusion of the male and female pronuclei, or injected into the 
nucleus of an embryonic cell (e.g., the nucleus of a two-cell 
embryo) following the initiation of cell division (Brinster 
et al., Proc. Nat. Acad. Sci. USA, 82: 4438-4442 (1985)). 
Embryos can be infected with viruses, especially 
retroviruses, modified to bear thrombospondin-4 genes of the 
invention. 

Pluripotent stem cells derived from the inner cell mass 
of the embryo and stabilized in culture can be manipulated in 
culture to incorporate thrombospondin-4 genes of the 
invention. A transgenic animal can be produced from such 
cells through implantation into a blastocyst that is 
implanted into a foster mother and allowed to come to term. 

Animals suitable for transgenic experiments can be 
obtained from standard commercial sources such as Charles 
River (Wilmington, MA), Taconic (Germantown, NY), Harlan 
Sprague Dawley (Indianapolis, IN), etc. Swiss Webster female 
mice are preferred for embryo retrieval and transfer. 
B6D2F1 males can be used for mating and vasectomized Swiss 
Webster studs can be used to stimulate pseudopregnancy. 
Vasectomized mice and rats can be obtained from the supplier. 

Microinjection Procedures 

The procedures for manipulation of the rodent embryo and 
for microinjection of DNA into the pronucleus of the zygote 
are well known to those of ordinary skill in the art (Hogan 
et al., supra). Microinjection procedures for fish, 
amphibian eggs and birds are detailed in Houdebine and 
Chourrout, Experientia, 47: 897-905 (1991). Other procedures 
for introduction of DNA into tissues of animals are described 
in U.S. Patent No. 4,945,050 (Sandford et al., July 30, 
1990). 

Transgenic Mice 

Female mice six weeks of age are induced to superovulate 
with a 5 IU injection (0.1 cc, ip) of pregnant mare serum 
gonadotropin (PMSG; Sigma) followed 48 hours later by a 5 IU 
injection (0.1 cc, ip) of human chorionic gonadotropin (hCG; 
Sigma). Females are placed with males immediately after hCG 
injection. Twenty-one hours after hCG, the mated females are 
sacrificed by CO2 asphyxiation or cervical dislocation and 
embryos are recovered from excised oviducts and placed in 
Dulbecco's phosphate buffered saline (DPBS) with 0.5% bovine 
serum albumin (BSA; Sigma). Surrounding cumulus cells are 
removed with hyaluronidase (1 mg/ml). Pronuclear embryos are 
then washed and placed in Earle's balanced salt solution 
containing 0.5% BSA (EBSS) in a 37.5°C incubator with a 
humidified atmosphere at 5% CO2, 95% air until the time of 
injection. 

Randomly cycling adult female mice are paired with 
vasectomized males. Swiss Webster or other comparable strains 
can be used for this purpose. Recipient females are mated at 
the same time as donor females. At the time of embryo 
transfer, the recipient females are anesthetized with an 
intraperitoneal injection of 0.015 ml of 2.5% avertin per 
gram of body weight. The oviducts are exposed by a single 
midline dorsal incision. An incision is then made through 
the body wall directly over the oviduct. The ovarian bursa 
is then torn with watchmaker's forceps. Embryos to be 
transferred are placed in DPBS and in the tip of a transfer 
pipet (about 10-12 embryos). The pipet tip is inserted into 
the infundibulum and the embryos transferred. After the 
transfer, the incision is closed by two sutures. 

Transgenic Rats 

The procedure for generating transgenic rats is similar 
to that for mice. See Hammer et al., Cell, 63:1099-1112 
(1990). Thirty-day-old female rats are given a subcutaneous 
injection of 20 IU of PMSG (0.1 cc) and 48 hours later each 
female is placed with a proven male. At the same time, 40-80 
day old females are placed in cages with vasectomized males. 
These will provide the foster mothers for embryo transfer. 
The next morning females are checked for vaginal plugs. 
Females that have mated with vasectomized males are held 
aside until the time of transfer. Donor females that have 
mated are sacrificed (CO2 asphyxiation) and their oviducts 
removed and placed in DPBS with 0.5% BSA, and the embryos 
are collected. Cumulus cells surrounding the embryos are 
removed with hyaluronidase (1 mg/ml). The embryos are then 
washed and placed in EBSS (Earle's balanced salt solution) 
containing 0.5% BSA in a 37.5°C incubator until the time of 
microinjection. 

Once the embryos are injected, the live embryos are moved 
to DPBS for transfer into foster mothers. The foster mothers 
are anesthetized with ketamine (40 mg/kg, ip) and xylazine (5 
mg/kg, ip). A dorsal midline incision is made through the 
skin and the ovary and oviduct are exposed by an incision 
through the muscle layer directly over the ovary. The 
ovarian bursa is torn, the embryos are picked up into the 
transfer pipet, and the tip of the transfer pipet is inserted 
into the infundibulum. Approximately 10-12 embryos are 
transferred into each rat oviduct through the infundibulum. 
The incision is then closed with sutures, and the foster 
mothers are housed singly. 
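
The weight-based anesthetic doses quoted in the mouse and rat transfer procedures (0.015 ml of 2.5% avertin per gram; ketamine at 40 mg/kg and xylazine at 5 mg/kg) reduce to simple multiplication. The sketch below is a worked arithmetic example only; the 25 g mouse and 250 g rat are hypothetical body weights chosen for illustration, and the function names are not from the source.

```python
def avertin_volume_ml(body_weight_g: float, ml_per_g: float = 0.015) -> float:
    """Volume of 2.5% avertin for a mouse, at 0.015 ml per gram body weight."""
    return ml_per_g * body_weight_g


def dose_mg(body_weight_g: float, mg_per_kg: float) -> float:
    """Drug mass for a dose expressed in mg per kg body weight."""
    return mg_per_kg * body_weight_g / 1000.0


mouse_g = 25.0   # hypothetical recipient mouse
rat_g = 250.0    # hypothetical foster-mother rat

print(round(avertin_volume_ml(mouse_g), 3))  # 0.375 (ml of 2.5% avertin)
print(dose_mg(rat_g, 40))                    # 10.0 (mg ketamine)
print(dose_mg(rat_g, 5))                     # 1.25 (mg xylazine)
```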

Embryonic Stem (ES) Cell Methods 

Introduction of DNA into ES cells: 

Methods for the culturing of ES cells and the subsequent 
production of transgenic animals by the introduction of DNA 
into ES cells using methods such as electroporation, calcium 
phosphate/DNA precipitation, and direct injection are well 
known to those of ordinary skill in the art. See, for 
example, Teratocarcinomas and Embryonic Stem Cells, A 
Practical Approach , E.J. Robertson, ed., IRL Press (1987). 
Selection of the desired clone of thrombospondin-4-containing 
ES cells is accomplished through one of several means. 
Although embryonic stem cells are currently available for 
mice only, it is expected that similar methods and procedures 
as described and cited here will be effective for embryonic 
stem cells from different species as they become available. 

In cases involving random gene integration, a clone 
containing the thrombospondin-4 gene of the invention is 
co-transfected with a gene encoding neomycin resistance. 
Alternatively, the gene encoding neomycin resistance is 
physically linked to the thrombospondin-4 gene. Transfection 
is carried out by any one of several methods well known to 
those of ordinary skill in the art (E.J. Robertson, supra). 
Calcium phosphate/DNA precipitation, direct injection, and 
electroporation are the preferred methods. Following DNA 
introduction, cells are fed with selection medium containing 
10% fetal bovine serum in DMEM supplemented with G418 
(between 200 and 500 µg/ml biological weight). Colonies 
of cells resistant to G418 are isolated using cloning rings 
and expanded. DNA is extracted from drug-resistant clones 
and Southern blotting experiments using a transgene-specific 
DNA probe are used to identify those clones carrying the 
thrombospondin-4 sequences. In some experiments, PCR methods 
are used to identify the clones of interest. 

DNA molecules introduced into ES cells can also be 
integrated into the chromosome through the process of 
homologous recombination. Capecchi, Science, 244: 1288-1292 
(1989). Direct injection results in a high efficiency of 
integration. Desired clones are identified through PCR of 
DNA prepared from pools of injected ES cells. Positive cells 
within the pools are identified by PCR subsequent to cell 
cloning. DNA introduction by electroporation is less 
efficient and requires a selection step. Methods for 
positive selection of the recombination event (i.e., neo 
resistance) and dual positive-negative selection (i.e., neo 
resistance and gancyclovir resistance) and the subsequent 
identification of the desired clones by PCR have been 
described by Capecchi, supra and Joyner et al., Nature, 338: 
153-156 (1989), the disclosures of which are incorporated 
herein. 

Embryo Recovery and ES Cell Injection: 

Naturally cycling or superovulated female mice mated with 
males are used to harvest embryos for the implantation of ES 
cells. It is desirable to use the C57BL/6 strain for this 
purpose when using mice. Embryos of the appropriate age are 
recovered approximately 3.5 days after successful mating. 
Mated females are sacrificed by CO2 asphyxiation or 
cervical dislocation and embryos are flushed from excised 
uterine horns and placed in Dulbecco's modified essential 
medium plus 10% calf serum for injection with ES cells. 
Approximately 10-20 ES cells are injected into blastocysts 
using a glass microneedle with an internal diameter of 
approximately 20 µm. 

Transfer of Embryos to Receptive Females: 
Randomly cycling adult female mice are paired with 
vasectomized males. Mouse strains such as Swiss Webster, ICR 
or others can be used for this purpose. Recipient females 
are mated such that they will be at 2.5 to 3.5 days 
post-mating when required for implantation with blastocysts 
containing ES cells. At the time of embryo transfer, the 
recipient females are anesthetized with an intraperitoneal 
injection of 0.015 ml of 2.5% avertin per gram of body 
weight. The ovaries are exposed by making an incision in the 
body wall directly over the oviduct and the ovary and uterus 
are externalized. A hole is made in the uterine horn with a 
25 gauge needle through which the blastocysts are 
transferred. After the transfer, the ovary and uterus are 
pushed back into the body and the incision is closed by two 
sutures. This procedure is repeated on the opposite side if 
additional transfers are to be made. 

Identification of Transgenic Mice and Rats 

Tail samples (1-2 cm) are removed from three-week-old 
animals. DNA is prepared and analyzed by Southern blot or 
PCR to detect transgenic founder (F0) animals and their 
progeny (F1 and F2). In this way, animals that have 
become transgenic for the desired thrombospondin-4 genes are 
identified. Because not every transgenic animal expresses 
the thrombospondin-4 gene, and not all of those that do will 
have the expression pattern anticipated by the experimenter, 
it is necessary to characterize each line of transgenic 
animals with regard to expression of thrombospondin-4 in 
different tissues. 

Production of Non-Rodent Transgenic Animals 

Procedures for the production of transgenic non-rodent 
mammals and other animals have been discussed by others. See 
Houdebine and Chourrout, supra; Pursel et al., Science 244: 
1281-1288 (1989); and Simms et al., Bio/Technology, 6: 
179-183 (1988). 

Identification of Other Transgenic Organisms 

An organism is identified as a potential transgenic by 
taking a sample of the organism for DNA extraction and 
hybridization analysis with a probe complementary to the 
thrombospondin-4 gene of interest. Alternatively, DNA 
extracted from the organism can be subjected to PCR analysis 
using PCR primers complementary to the thrombospondin-4 gene 
of interest. 
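
The PCR screen described above can be pictured as a string search: a product is expected only when the forward primer and the reverse complement of the reverse primer both occur, in that order, in the sample DNA. The sketch below illustrates this under simplifying assumptions (exact matches only, top strand only, first occurrence only). The primer and template sequences are invented for illustration and do not come from any thrombospondin-4 sequence.

```python
from typing import Optional


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_length(template: str, forward_primer: str,
                    reverse_primer: str) -> Optional[int]:
    """Length of the expected PCR product, or None if a primer site is absent."""
    start = template.find(forward_primer)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return end + len(reverse_site) - start


# Hypothetical primers and templates.
FWD = "ACGTACGA"
REV = "ATGGATCC"
transgenic = "CCC" + FWD + "T" * 12 + reverse_complement(REV) + "AAA"
wild_type = "CCC" + "T" * 30 + "AAA"

print(amplicon_length(transgenic, FWD, REV))  # 28
print(amplicon_length(wild_type, FWD, REV))   # None
```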

Example 6: Protocol for Inactivating the Thrombospondin-4 
Gene 

Mouse genomic clones are isolated by screening a genomic 
library from the D3 strain of mouse with a Xenopus 
thrombospondin-4 probe. Duplicate lifts are hybridized with 
a radiolabeled probe by established protocols (Sambrook, J. 
et al., The Cloning Manual, Cold Spring Harbor Press, N.Y.). 
Plaques that correspond to a positive signal on both lifts 
are isolated and purified by successive screening rounds at 
decreasing plaque density. The identity of the isolated 
clones is confirmed by nucleotide sequencing. 

The genomic clones are used to prepare a gene targeting 
vector for the deletion of thrombospondin-4 in embryonic stem 
cells by homologous recombination. A neomycin resistance 
gene (neo), with its transcriptional and translational 
signals, is cloned into convenient sites that are near the 5' 
end of the gene. This will disrupt the coding sequence of 
thrombospondin-4 and allow embryonic stem (ES) cells 
transfected with the vector to be selected with the drug 
Geneticin (G418). Only stem cells with the neo gene will 
grow in the presence of this drug. The Herpes simplex virus 
thymidine kinase (HSV-tk) gene is placed at the other end of 
the genomic DNA as a second selectable marker. 

Random integration of this construct into the ES genome 
will occur via sequences at the ends of the construct. In 
these cell lines, the HSV-tk gene will be functional and the 
drug gancyclovir will therefore be cytotoxic to cells having 
a randomly integrated copy of the mutated thrombospondin-4 
coding sequence. 

Homologous recombination will also take place between 
the thrombospondin-4 gene in the ES cell genome and the 
homologous DNA sequences of the targeting vector. This 
usually results in the excision of the HSV-tk gene because it 
is not homologous with the thrombospondin-4 gene. 

Thus, by growing the transfected ES cells in G418 and 
gancyclovir, the cell lines in which homologous recombination 
has occurred will be highly enriched. These cells will 
contain a disrupted coding sequence of thrombospondin-4. 
Individual clones are isolated and grown up to produce enough 
cells for frozen stocks and for preparation of DNA. Clones 
in which the thrombospondin-4 gene has been successfully 
targeted are identified by Southern blot analysis. The final 
phase of the procedure is to inject targeted ES cells into 
blastocysts and to transfer the blastocysts into 
pseudopregnant females. The resulting chimeric animals are 
bred and the offspring are analyzed by Southern blotting to 
identify individuals that carry the mutated form of the gene 
in the germ line. These animals will be mated to determine 
the effect of thrombospondin-4 deficiency on murine 
development and physiology. 
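
The logic of the G418 plus gancyclovir selection described in this example can be written as a small truth table: G418 kills cells lacking neo, and gancyclovir kills cells that retain a functional HSV-tk gene. The sketch below is only an illustration of that logic; the `Clone` class, the `survives` function and the three idealized outcomes are assumptions, and real selections are enrichments rather than clean filters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    label: str
    has_neo: bool   # neo integrated, so resistant to G418
    has_tk: bool    # HSV-tk retained, so sensitive to gancyclovir


def survives(clone: Clone, g418: bool = True, gancyclovir: bool = True) -> bool:
    """Idealized survival under positive (G418) and negative (gancyclovir) selection."""
    if g418 and not clone.has_neo:
        return False
    if gancyclovir and clone.has_tk:
        return False
    return True


clones = [
    Clone("untransfected", has_neo=False, has_tk=False),
    Clone("random integrant", has_neo=True, has_tk=True),
    Clone("homologous recombinant", has_neo=True, has_tk=False),
]

print([c.label for c in clones if survives(c)])
# ['homologous recombinant']
print([c.label for c in clones if survives(c, gancyclovir=False)])
# ['random integrant', 'homologous recombinant']
```

The second call shows why G418 alone is not sufficient: random integrants also carry neo, and only the negative selection removes them.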

It should be understood that the preceding is merely a 
detailed description of certain preferred embodiments. It 
therefore should be apparent to those skilled in the art that 
various modifications and equivalents can be made without 
departing from the spirit or scope of the invention. 

SEQUENCE LISTING 



(1) GENERAL INFORMATION: 



(i) APPLICANT: 

(A) NAME: BRIGHAM AND WOMEN'S HOSPITAL, INC. 

(B) STREET: 75 Francis Street 

(C) CITY: Boston 

(D) STATE: Massachusetts 

(E) COUNTRY: United States of America 

(F) ZIP: 02115 

(G) TELEPHONE: 617-732-5504 

(H) TELEFAX: 617-732-5343 



(ii) TITLE OF INVENTION: HUMAN THROMBOSPONDIN-4 



(iii) NUMBER OF SEQUENCES: 8 



(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Wolf, Greenfield, & Sacks, P.C. 

(B) STREET: 600 Atlantic Avenue 

(C) CITY: Boston 

(D) STATE: Massachusetts 

(E) COUNTRY: United States of America 

(F) ZIP: 02210 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette, 3 1/2 inch 

(B) COMPUTER: IBM-compatible 

(C) OPERATING SYSTEM: MS-DOS Version 3.3 

(D) SOFTWARE: WordPerfect 5.1 



(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: not available 

(B) FILING DATE: filed herewith 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 07/985,296 

(B) FILING DATE: 04-DEC-1992 



(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: GATES, Edward R. 

(B) REGISTRATION NUMBER: 31,616 

(C) REFERENCE/DOCKET NUMBER: B0801/7005WO 

(2) INFORMATION FOR SEQ ID NO: 1: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2820 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iv) ANTI-SENSE: no 

(vii) ORIGINAL SOURCE 

(A) ORGANISM: Xenopus laevis 

(D) DEVELOPMENTAL STAGE: Stage 45 (germ line) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1 

CAGCCCAAGT CCACAGTTAC GCTCTTTGGA CTTTATTCCA CCAGTGACAA CAGCAGGTTC 60 
TTTGAATTCA CAGTTATGGG TCGTTTAAAC AAAGCCTCTT TACGATACCT CCGGAGTGAT 120 
GGGAAGTTAC ACTCAGTCTT CTTTAATAAG CTTGACATAG CTGATGGGAA GCAGCACGCG 180 
CTTCTGTTGC ACCTGAGCGG CTTACACCGG GGCGCAACGT TTGCAAAGCT CTACATAGAC 240 
TGTAATCCGA CAGGTGTTGT TGAAGATCTA CCCCGGCCGT TATCAGGGAT AAGGCTCAAC 300 
ACAGGGTCTG TGCACTTAAG AACACTACAG AAAAAGGGAC AGGATTCCAT GGATGAATTA 360 
AAACTGGTAA TGGGAGGCAC TCTGTCCGAG GTAGGAGCAA TACAAGAATG TTTTATGCAG 420 
AAAAGTGAAG CCGGACAGCA GACAGGTGAC GTCAGCAGAC AGTTGATTGG CCAGATAACC 480 
CAAATGAATC AGATGCTGGG AGAGCTCCGA GATGTCATGA GACAGCAGGT GAAAGAGACC 540 
ATGTTCTTGA GAAACACCAT TGCAGAATGC CAGGCCTGTG GCTTAGGTCC TGACTTCCCA 600 
TTGCCAACCA AAGTTCCCCA GCGCCTAGCC ACCACTACAC CTCCAAAGCC TCGATGTGAT 660 
GCAACTTCAT GTTTCAGAGG AGTGCGGTGC ATTGATACAG AGGGCGGCTT CCAATGTGGG 720 
CCGTGTCCTG AAGGCTATAC AGGCAACGGT GTCATTTGTA CTGATGTGGA TGAGTGTCGG 780 
TTGAATCCAT GTTTCCTTGG TGTACGTTGC ATAAACACTT CTCCGGGTTT CAAATGTGAG 840 
AGCTGCCCTC CCGGGTACAC TGGATCCACA ATTCAAGGGA TTGGCATTAA CTTTGCCAAG 900 
CAAAATAAGC AGGTTTGCAC AGATACCAAT GAATGTGAAA ATGGAAGAAA TGGAGGGTGT 960 
ACATCCAATT CTCTTTGCAT CAATACGATG GGATCTTTCC GCTGTGGGGG CTGCAAACCT 1020 
GGTTATGTCG GGGATCAAAT AAAAGGCTGC AAACCTGAAA AAAGCTGCCG TCATGGACAG 1080 
AATCCGTGTC ATGCAAGTGC TCAGTGTTCA GAGGAAAAGG ACGGTGACGT AACCTGCACT 1140 
TGTTCAGTCG GTTGGGCCGG CAATGGCTAC CTCTGTGGCA AAGATACTGA TATTGATGGC 1200 
TACCCGGATG AAGCCCTGCC ATGTCCAGAT AAGAACTGCA AAAAGGACAA CTGTGTATAT 1260 
GTTCCTAACT CGGGTCAAGA AGACACTGAT AAAGATAACA TTGGAGATGC TTGTGATGAA 1320 
GATGCGGATG GAGATGGTAT CCTAAATGAG CAGGACAACT GTGTGCTGGC TGCCAACATC 1380 
GATCAGAAAA ACAGTGACCA AGATATATTT GGGGACGCCT GTGACAACTG CCGCTTAACC 1440 
CTCAACAATG ACCAAAGGGA CACAGACAAT GACGGGAAAG GAGATGCTTG TGACGATGAC 1500 
ATGGATGGAG ATGGCATCAA GAATATCTTG GATAACTGCC AGAGAGTTCC CAATGTGGAC 1560 
CAGAAAGACA AAGATGGAGA TGGAGTTGGT GATATATGTG ACAGCTGTCC TGACATCATA 1620 
AATCCAAACC AGTCAGACAT TGACAATGAC CTTGTTGGAG ATTCCTGTGA TACTAACCAA 1680 
GACAGCGATG GTGATGGTCA CCAGGACAGC ACAGACAACT GCCCCACAGT GATAAACAGC 1740 
AACCAGCTCG ACACAGACAA GGACGGCATC GGAGATGAAT GTGACGATGA TGATGATAAC 1800 
GATGGAATCC CGGATACTGT TCCTCCCGGA CCTGATAACT GTAAACTGGT TCCCAACCCA 1860 
GGGCAGGAGG ATGACAACAA TGATGGAGTC GGAGACGTCT GTGAGGCCGA TTTTGACCAG 1920 
GACACGGTCA TTGACCGAAT TGACGTTTGC CCTGAAAATG CAGAGATCAC CCTGACAGAT 1980 
TTCAGAGCTT ATCAAACTGT AGTTCTGGAT CCCGAAGGAG ATGCCCAAAT TGATCCAAAC 2040 
TGGATTGTTT TGAACCAGGG AATGGAGATT GTGCAGACGA TGAACAGTGA CCCTGGACTG 2100 
GCAGTTGGTT ACACAGCATT TAATGGAGTT GATTTCGAGG GCACATTCCA CGTGAACACC 2160 
ATGACGGATG ATGATTACGC TGGTTTCATC TTTGGTTATC AGGACAGTTC AAGCTTTTAT 2220 
GTGGTGATGT GGAAGCAGAC TGAGCAGACT TACTGGCAGG CAACCCCCTT CAGAGCAGTT 2280 
GCAGAGCCTG GAATCCAACT GAAGGCTGTG AAATCCAAGT CAGGACCCGG GGAACATCTG 2340 
AGGAACGCTC TGTGGCACAC AGGAGACACC AATGATCAAG TGAGGCTGCT CTGGAAAGAC 2400 
CCCAGGAATG TCGGCTGGAA AGACAAAGTC TCCTACCGCT GGTTCTTACA GCACAGGCCA 2460 
CAAGTCGGCT ACATCAGAGC CAGATTTTAT GAAGGCACCG AGCTGGTGGC TGACTCTGGA 2520 
GTCACTGTGG ACACCACCAT GCGAGGAGGA AGACTGGGAG TATTCTGCTT TTCACAGGAA 2580 
AACATAATTT GGTCCAATCT GAAATACCGG TGTAATGATA CAATCCCAGA GGATTTCCAG 2640 
GCATTTCAAG CACAACAGTT TTCCAGTTAA ACAGAACCCA CACAATATCC GGTGATTTTT 2700 
TTTTGTGATT TTTTTTTTGT AGTAATATGA GAAAACGTTA TTTTCATGCA GCCTTGTTTT 2760 
CTACCAACTG TACAATAATG TCTGTAAAAT AAAATGGATA CAAAAATGAG AAAAAAAAAA 2820 




(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 889 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(iii) HYPOTHETICAL: yes 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2 



Gin 


Pro 


Lys 


Ser 


Thr 


Val 


Thr 


Leu 


Phe 


Gly 


Leu 


Tyr 


Ser 


Thr 


Ser 


Asp 


1 








5 










10 










15 




Asn 


Ser 


Arg 


Phe 


Phe 


Glu 


Phe 


Thr 


Val 


Met 


Gly 


Arg 


Leu 


Asn 


Lys 


Ala 








20 










25 










30 






Ser 


Leu 


Arg 


Tyr 


Leu 


Arg 


Ser 


Asp 


Gly 


Lys 


Leu 


His 


Ser 


Val 


Phe 


Phe 






35 










40 










45 








Asn 


Lys 


Leu 


Asp 


He 


Ala 


Asp 


Gly 


Lys 


Gin 


His 


Ala 


Leu 


Leu 


Leu 


His 




50 










55 










60 










Leu 


Ser 


Gly 


Leu 


His 


Arg 


Gly 


Ala 


Thr 


Phe 


Ala 


Lys 


Leu 


Tyr 


He 


Asp 


65 










70 










75 










80 


Cys 


Asn 


Pro 


Thr 


Gly 


Val 


Val 


Glu 


Asp 


Leu 


Pro 


Arg 


Pro 


Leu 


Ser 


Gly 










85 










90 










95 




He 


Arg 


Leu 


Asn 


Thr 


Gly 


Ser 


Val 


His 


Leu 


Arg 


Thr 


Leu 


Gin 


Lys 


Lys 








100 










105 










110 






Gly 


Gin 


Asp 


Ser 


Met 


Asp 


Glu 


Leu Lys 


Leu 


Val 


Met 


Gly 


Gly 


Thr 


Leu 






115 










120 










125 








Ser 


Glu 


Val 


Gly 


Ala 


He 


Gin 


Glu 


Cys 


Phe 


Met 


Gin 


Lys 


Ser 


Glu 


Ala 




130 










135 










140 










Gly 


Gin 


Gin 


Thr 


Gly 


Asp 


Val 


Ser 


Arg 


Gin 


Leu 


He 


Gly 


Gin 


lie 


Thr 


145 










150 










155 










160 


Gin 


Met 


Asn 


Gin 


Met 


Leu 


Gly 


Glu 


Leu 


Arg 


Asp 


Val 


Met 


Arg 


Gin 


Gin 










165 










170 










175 




Val 


Lys 


Glu 


Thr 


Met 


Phe 


Leu 


Arg 


Asn 


Thr 


He 


Ala 


Glu 


Cys 


Gin 


Ala 








180 










185 










190 






Cys 


Gly 


Leu 


Gly 


Pro 


Asp 


Phe 


Pro 


Leu 


Pro 


Thr 


Lys 


Val 


Pro 


Gin 


Arg 






195 










200 










205 








Leu 


Ala 


Thr 


Thr 


Thr 


Pro 


Pro 


Lys 


Pro 


Arg 


Cys 


Asp 


Ala 


Thr 


Ser 


Cys 




210 










215 










220 










Phe 


Arg 


Gly 


Val 


Arg 


Cys 


He 


Asp 


Thr 


Glu 


Gly 


Gly 


Phe 


Gin 


Cys 


Gly 


225 










230 










235 










240 


Pro 


Cys 


Pro 


Glu 


Gly 


Tyr 


Thr 


Gly 


Asn 


Gly Val 


lie 


Cys 


Thr 


Asp 


Val 










245 










250 










255 




Asp 


Glu 


Cys 


Arg 


Leu 


Asn 


Pro 


Cys 


Phe 


Leu 


Gly 


Val 


Arg 


Cys 


lie 


Asn 



260 265 270 
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Thr 


Ser 


Pro 








275 




Ser 


Thr 


He 


Gin 




290 






Val 


Cys 


Thr 


Asp 


305 








Thr 


Ser 


Asn 


Ser 


Gly 


Cys 


Lys 


Pro 








340 


Glu 


Lys 


Ser 


Cys 






355 




Cys 


Ser 


Glu 


Glu 




370 






Trp 


Ala Gly 


Asn 


385 








Tyr 


Pro 


Asp 


Glu 


Asn 


Cys 


Val 


Tyr 








420 


Asn 


lie 


Gly 


Asp 






435 




Asn 


Glu 


Gin 


Asp 




450 






Ser 


Asp 


Gin 


Asp 


465 








Leu 


Asn 


Asn 


Asp 


Cys 


Asp 


Asp 


Asp 








500 


Cys 


Gin 


Arg 


Val 






515 




Val 


Gly Asp 


He 




530 






Ser 


Asp 


He 


Asp 


545 








Asp 


Ser 


Asp 


Gly 


Val 


He 


Asn 


Ser 








580 


Glu 


Cys 


ASp 


Asp 






595 




Pro 


Gly 


Pro 


Asp 




610 






Asp 


Asn 


Asn 


Asp 


625 








Asp 


Thr 


Val 


lie 


Thr 


Leu 


Thr 


Asp 








660 


Gly 


Asp 


Ala 


Gin 






675 




Glu 


He 


Val 


Gin 




690 







Phe Lys Cys Glu 
280 

Gly He Gly He 
295 

Thr Asn Glu Cys 
310 

Leu Cys He Asn 
325 

Gly Tyr Val Gly 

Arg His Gly Gin 
360 

Lys Asp Gly Asp 
375 

Gly Tyr Leu Cys 
390 

Ala Leu Pro Cys 
405 

Val Pro Asn Ser 

Ala Cys Asp Glu 
440 

Asn Cys Val Leu 
455 

He Phe Gly Asp 
470 

Gin Arg Asp Thr 
485 

Met Asp Gly Asp 

Pro Asn Val Asp 
520 

Cys Asp Ser Cys 
535 

Asn Asp Leu Val 
550 

Asp Gly His Gin 
565 

Asn Gin Leu Asp 

Asp Asp Asp Asn 
600 

Asn Cys Lys Leu 
615 

Gly Val Gly Asp 
630 

Asp Arg He Asp 
645 

Phe Arg Ala Tyr 

He Asp Pro Asn 
680 

Thr Met Asn Ser 
695 
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Ser Cys Pro Pro 

Asn Phe Ala Lys 
300 

Glu Asn Gly Arg 
315 

Thr Met Gly Ser 
330 

Asp Gin He Lys 
345 

Asn Pro Cys His 

Val Thr Cys Thr 
380 

Gly Lys Asp Thr 
395 

Pro Asp Lys Asn 
410 

Gly Gin Glu Asp 
425 

Asp Ala Asp Gly 

Ala Ala Asn He 
460 

Ala Cys Asp Asn 
475 

Asp Asn Asp Gly 
490 

Gly lie Lys Asn 
505 

Gin Lys Asp Lys 

Pro Asp lie lie 
540 

Gly Asp Ser Cys 
555 

Asp Ser Thr Asp 
570 

Thr Asp Lys Asp 
585 

Asp Gly lie Pro 

Val Pro Asn Pro 
620 

Val Cys Glu Ala 
635 

Val Cys Pro Glu 
650 

Gin Thr Val Val 
665 

Trp He Val Leu 

Asp Pro Gly Leu 
700 



Gly Tyr Thr Gly 
285 

Gin Asn Lys Gin 

Asn Gly Gly Cys 
320 

Phe Arg Cys Gly 
335 

Gly Cys Lys Pro 
350 

Ala Ser Ala Gin 
365 

Cys Ser Val Gly 

Asp lie Asp Gly 
400 

Cys Lys Lys Asp 
415 

Thr Asp Lys Asp 
430 

Asp Gly lie Leu 
445 

Asp Gin Lys Asn 

Cys Arg Leu Thr 
480 

Lys Gly Asp Ala 
495 

He Leu Asp Asn 
510 

Asp Gly Asp Gly 
525 

Asn Pro Asn Gin 

Asp Thr Asn Gin 
560 

Asn Cys Pro Thr 
575 

Gly He Gly Asp 
590 

Asp Thr Val Pro 
605 

Gly Gin Glu Asp 

Asp Phe Asp Gin 
640 

Asn Ala Glu lie 
655 

Leu Asp Pro Glu 
670 

Asn Gin Gly Met 
685 

Ala Val Gly Tyr 

Thr 


Ala 


Phe 


Asn 


Gly 


Val 


Asp 


Phe 


Glu 


Gly 


Thr 


irne 


rll S 


val 


Asn 


inr 


705 










710 










715 










Ton 


Met 


Thr 


Asp 


Asp 


Asp 


Tyr 


Ala 


Gly 


Phe 


lie 


Phe 


bly 


Tyr 


bin 


Asp 


Car 










725 










7 30 










/ i D 




Ser 


Ser 


Phe 


Tyr 


Val 


Val 


Met 


Trp 


Lys 


bin 


inr 


Pill 


Gin 


Thr 


Tyr 


Trp 








740 










745 










750 






Gin 


Ala 


Thr 


Pro 


Phe 


Arg 


Ala 


Val 


Ala 


blU 


fro 


biy 


He 


Gin 


Leu 


uy s 






755 










7 60 










765 








Ala 


Val 


Lys 


Ser 


Lys 


ber 


bly 


Pro 


bly 


blU 


rll S 


Leu 


Arg 


Asn 


Ala 


Leu 




*7 "7 r\ 

1 7 0 










I/O 










7 ft n 










Trp 


nlS 


inr 


O xy 


Asp 


XII L 


Asn 


Asp 


VJ X 11 


vai 




Le u 


Leu 


Trp 


Lys 


Asp 


"7 Q C 
/ O D 










7 q n 










7 Q R 










800 


Pro 


Arg 


As n 


Val 


<j xy 


irp 




Asp 




Val 


Cpr 
OC 1 


xy i 


Arg Trp 


Phe 


Leu 










one 
oUD 










OiU 










O X J 




Gin 


His 


Arg 


Pro 


bin 


vai 




Tyr 


T T d 

lie 


Arg 


a j. a 


Arg 


Phe 


Tyr 


u 1 U. 


m v 

oiy 








820 










825 










830 






Thr 


Glu 


Leu 


Val 


Ala 


Asp 


Ser 


Gly 


Val 


Thr 


Val 


Asp 


Thr 


Thr 


Met 


Arg 






835 










840 










845 








Gly 


Gly 


Arg 


Leu 


Gly 


Val 


Phe 


Cys 


Phe 


Ser 


Gin 


Glu 


Asn 


He 


He 


Trp 




850 










855 










860 










Ser 


Asn 


Leu 


Lys 


Tyr 


Arg 


Cys 


Asn 


Asp 


Thr 


He 


Pro 


Glu 


Asp 


Phe 


Gin 


865 










870 










875 










880 


Ala 


Phe 


Gin 


Ala 


Gin 


Gin 


Phe 


Ser 


Ser 
























885 
























(2) INFORMATION FOR SEQ ID NO: 3:

















(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3074 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iv) ANTI-SENSE: no 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3 

GAATTCCGGG GAGCAGGAAG AGCCAACATG CTGGCCCCGC GCGGAGCCGC CGTCCTCCTG 60

CTGCACCTGG TCCTGCAGCG GTGGCTAGCG GCAGGCGCCC AGGCCACCCC CCAGGTCTTT 120

GACCTTCTCC CATCTTCCAG TCAGAGGCTA AACCCAGGCG CTCTGCTGCC AGTCCTGACA 180

GACCCCGCCC TGAATGATCT CTATGTGATT TCCACCTTCA AGCTGCAGAC TAAAAGTTCA 240

GCCACCATCT TCGGTCTTTA CTCTTCAACT GACAACAGTA AATATTTTGA ATTTACTGTG 300

ATGGGACGCT TAAGCAAAGC CATCCTCCGT TACCTGAAGA ACGATGGGAA GGTGCATTTG 360

GTGGTTTTCA ACAACCTGCA GCTGGCAGAC GGAAGGCGGC ACAGGATCCT CCTGAGGCTG 420

AGCAATTTGC AGCGAGGGGC CGGCTCCCTA GAGCTCTACC TGGACTGCAT CCAGGTGGAT 480

TCCGTTCACA ATCTCCCCAG GGCCTTTGCT GGCCCCTCCC AGAAACCTGA GACCATTGAA 540

TTGAGGACTT TCCAGAGGAA GCCACAGGAC TTCTTGGAAG AGCTGAAGCT GGTGGTGAGA 600

GGCTCACTGT TCCAGGTGGC CAGCCTGCAA GACTGCTTCC TGCAGCAGAG TGAGCCACTG 660

GCTGCCACAG GCACAGGGGA CTTTAACCGG CAGTTCTTGG GTCAAATGAC ACAATTAAAC 720

CAACTCCTGG GAGAGGTGAA GGACCTTCTG AGACAGCAGG TTAAGGAAAC ATCATTTTTG 780

CGAAACACCA TAGCTGAATG CCAGGCTTGC GGTCCTCTCA AGTTTCAGTC TCCGACCCCA 840

AGCACGGTGG TCGCCCCGGC TCCCCCTGCA CCGCCAACAC GCCCACCTCG TCGGTGTGAC 900

TCCAACCCAT GTTTCCGAGG TGTCCAATGT ACCGACAGTA GAGATGGCTT CCAGTGTGGG 960

CCCTGCCCCG AGGGCTACAC AGGAAACGGG ATCACCTGTA TTGATGTTGA TGAGTGCAAA 1020

TACCATCCCT GCTACCCGGG CGTGCACTGC ATAAATTTGT CTCCTGGCTT CAGATGTGAC 1080

GCCTGCCCAG TGGGCTTCAC AGGGCCCATG GTGCAGGGTG TTGGGATCAG TTTTGCCAAG 1140

TCAAACAAGC AGGTCTGCAC TGACATTGAT GAGTGTCGAA ATGGAGCGTG CGTTCCCAAC 1200

TCGATCTGCG TTAATACTTT GGGATCTTAC CGCTGTGGGC CTTGTAAGCC GGGGTATACT 1260

GGTGATCAGA TAAGGGGATG CAAAGTGGAA AGAAACTGCA GAAACCCAGA GCTGAACCCT 1320

TGCAGTGTGA ATGCCCAGTG CATTGAAGAG AGGCAGGGGG ATGTGACATG TGTGTGTGGA 1380

GTCGGTTGGG CTGGAGATGG CTATATCTGT GGAAAGGATG TGGACATCGA CAGTTACCCC 1440

GACGAAGAAC TGCCATGCTC TGCCAGGAAC TGTAAAAAGG ACAACTGCAA ATATGTGCCA 1500

AATTCTGGCC AAGAAGATGC AGACAGAGAT GGCATTGGCG ACGCTTGTGA CGAGGATGCT 1560

GACGGAGATG GGATCCTGAA TGAGCAGGAT AACTGTGTCC TGATTCATAA TGTGGACCAA 1620

AGGAACAGCG ATAAAGATAT CTTTGGGGAT GCCTGTGATA ACTGCCTGAG TGTCTTAAAT 1680

AACGACCAGA AAGACACCGA TGGGGATGGA AGAGGAGATG CCTGTGATGA TGACATGGAT 1740

GGAGATGGAA TAAAAAACAT TCTGGACAAC TGCCCAAAAT TTCCCAATCG TGACCAACGG 1800

GACAAGGATG GTGATGGTGT GGGGGATGCC TGTGACAGTT GTCCTGATGT CAGCAACCCT 1860

AACCAGTCTG ATGTGGATAA TGATCTGGTT GGGGACTCCT GTGACACCAA TCAGGACAGT 1920

GATGGAGATG GGCACCAGGA CAGCACAGAC AACTGCCCCA CCGTCATTAA CAGTGCCCAG 1980

CTGGACACCG ATAAGGATGG AATTGGTGAC GAGTGTGATG ATGATGATGA CAATGATGGT 2040

ATCCCAGACC TGGTGCCCCC TGGACCAGAC AACTGCCGGC TGGTCCCCAA CCCAGCCCAG 2100

GAGGATAGCA ACAGCGACGG AGTGGGAGAC ATCTGTGAGT CTGACTTTGA CCAGGACCAG 2160

GTCATCGATC GGATCGACGT CTGCCCAGAG AACGCAGAGG TCACCCTGAC CGACTTCAGG 2220

GCTTACCAGA CCGTGGGCCT GGATCCTGAA GGGGATGCCC AGATCGATCC CAACTGGGTG 2280

GTCCTGAACC AGGGCATGGA GATTGTACAG ACCATGAACA GTGATCCTGG CCTGGCAGTG 2340

GGGTACACAG CTTTTAATGG AGTTGACTTC GAAGGGACCT TCCATGTGAA TACCCAGACA 2400

GATGATGACT ATGCAGGCTT TATCTTTGGC TACCAAGATA GCTCCAGCTT CTACGTGGTC 2460

ATGTGGAAGC AGACGGAGCA GACATATTGG CAAGCCACCC CATTCCGAGC AGTTGCAGAA 2520

CCTGGCATTC AGCTCAAGGC TGTGAAGTCT AAGACAGGTC CAGGGGAGCA TCTCCGGAAC 2580

TCCCTGTGGC ACACGGGGGA CACCAGTGAC CAGGTCAGGC TGCTGTGGAA GGACTCCAGG 2640

AATGTGGGCT GGAAGGACAA GGTGTCCTAC CGCTGGTTCC TACAGCACAG GCCCCAGGTG 2700

GGCTACATCA GGGTACGATT TTATGAAGGC TCTGAGTTGG TGGCTGACTC TGGCGTCACC 2760

ATAGACACCA CAATGCGTGG AGGCCGACTT GGCGTTTTCT GCTTCTCTCA AGAAAACATC 2820

ATCTGGTCCA ACCTCAAGTA TCGCTGCAAT GACACCATCC CTGAGGACTT CCAAGAGTTT 2880

CAAACCCAGA ATTTCGACCG CTTCGATAAT TAAACCAAGG AAGCAATCTG TAACTGCTTT 2940

TCGGAACACT AAAACCATAT ATATTTTAAC TTCAATTTTC TTTAGCTTTT ACCAACCCAA 3000

ATATATCAAA ACGTTTTATG TGAATGTGGC AATAAAGGAG AAGAGATCAT TTTTAAAAAA 3060

AAAAAAAAAA AAAA 3074
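
As an illustrative cross-check that is not part of the original listing, the first row of SEQ ID NO: 3 can be translated from its first ATG and compared with the opening residues of SEQ ID NO: 4. The sketch below assumes the standard genetic code; the function name `translate_from_first_atg` and the one-letter amino acid output are conveniences introduced here, not anything defined in the application.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption:
# the cDNA is translated with the standard nuclear code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_from_first_atg(dna: str) -> str:
    """Translate from the first ATG to the first stop codon or the end of the input."""
    seq = dna.replace(" ", "").upper()
    start = seq.find("ATG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":  # stop codon
            break
        protein.append(residue)
    return "".join(protein)


# Nucleotides 1-60 of SEQ ID NO: 3, copied from the listing above.
FIRST_ROW = "GAATTCCGGG GAGCAGGAAG AGCCAACATG CTGGCCCCGC GCGGAGCCGC CGTCCTCCTG"

print(translate_from_first_atg(FIRST_ROW))  # MLAPRGAAVLL
```

The printed residues correspond to Met Leu Ala Pro Arg Gly Ala Ala Val Leu Leu, the first eleven residues given for SEQ ID NO: 4.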



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 961 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: yes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4 

Met Leu Ala Pro Arg Gly Ala Ala Val Leu Leu Leu His Leu Val Leu
1 5 10 15

Gln Arg Trp Leu Ala Ala Gly Ala Gln Ala Thr Pro Gln Val Phe Asp
20 25 30

Leu Leu Pro Ser Ser Ser Gln Arg Leu Asn Pro Gly Ala Leu Leu Pro
35 40 45

Val Leu Thr Asp Pro Ala Leu Asn Asp Leu Tyr Val Ile Ser Thr Phe
50 55 60

Lys Leu Gln Thr Lys Ser Ser Ala Thr Ile Phe Gly Leu Tyr Ser Ser
65 70 75 80

Thr Asp Asn Ser Lys Tyr Phe Glu Phe Thr Val Met Gly Arg Leu Ser
85 90 95

Lys Ala Ile Leu Arg Tyr Leu Lys Asn Asp Gly Lys Val His Leu Val
100 105 110

Val Phe Asn Asn Leu Gln Leu Ala Asp Gly Arg Arg His Arg Ile Leu
115 120 125

Leu Arg Leu Ser Asn Leu Gln Arg Gly Ala Gly Ser Leu Glu Leu Tyr
130 135 140

Leu Asp Cys Ile Gln Val Asp Ser Val His Asn Leu Pro Arg Ala Phe
145 150 155 160

Ala Gly Pro Ser Gln Lys Pro Glu Thr Ile Glu Leu Arg Thr Phe Gln
165 170 175

Arg Lys Pro Gln Asp Phe Leu Glu Glu Leu Lys Leu Val Val Arg Gly
180 185 190

Ser Leu Phe Gln Val Ala Ser Leu Gln Asp Cys Phe Leu Gln Gln Ser
195 200 205

Glu Pro Leu Ala Ala Thr Gly Thr Gly Asp Phe Asn Arg Gln Phe Leu
210 215 220

Gly Gln Met Thr Gln Leu Asn Gln Leu Leu Gly Glu Val Lys Asp Leu
225 230 235 240

Leu Arg Gln Gln Val Lys Glu Thr Ser Phe Leu Arg Asn Thr Ile Ala
245 250 255

Glu Cys Gln Ala Cys Gly Pro Leu Lys Phe Gln Ser Pro Thr Pro Ser
260 265 270

Thr Val Val Ala Pro Ala Pro Pro Ala Pro Pro Thr Arg Pro Pro Arg
275 280 285

Arg Cys Asp Ser Asn Pro Cys Phe Arg Gly Val Gln Cys Thr Asp Ser
290 295 300

Arg Asp Gly Phe Gln Cys Gly Pro Cys Pro Glu Gly Tyr Thr Gly Asn
305 310 315 320

Gly Ile Thr Cys Ile Asp Val Asp Glu Cys Lys Tyr His Pro Cys Tyr
325 330 335

Pro Gly Val His Cys Ile Asn Leu Ser Pro Gly Phe Arg Cys Asp Ala
340 345 350

Cys Pro Val Gly Phe Thr Gly Pro Met Val Gln Gly Val Gly Ile Ser
355 360 365

Phe Ala Lys Ser Asn Lys Gln Val Cys Thr Asp Ile Asp Glu Cys Arg
370 375 380

Asn Gly Ala Cys Val Pro Asn Ser Ile Cys Val Asn Thr Leu Gly Ser
385 390 395 400

Tyr Arg Cys Gly Pro Cys Lys Pro Gly Tyr Thr Gly Asp Gln Ile Arg
405 410 415

Gly Cys Lys Val Glu Arg Asn Cys Arg Asn Pro Glu Leu Asn Pro Cys
420 425 430

Ser Val Asn Ala Gln Cys Ile Glu Glu Arg Gln Gly Asp Val Thr Cys
435 440 445

Val Cys Gly Val Gly Trp Ala Gly Asp Gly Tyr Ile Cys Gly Lys Asp
450 455 460

Val Asp Ile Asp Ser Tyr Pro Asp Glu Glu Leu Pro Cys Ser Ala Arg
465 470 475 480

Asn Cys Lys Lys Asp Asn Cys Lys Tyr Val Pro Asn Ser Gly Gln Glu
485 490 495

Asp Ala Asp Arg Asp Gly Ile Gly Asp Ala Cys Asp Glu Asp Ala Asp
500 505 510

Gly Asp Gly Ile Leu Asn Glu Gln Asp Asn Cys Val Leu Ile His Asn
515 520 525

Val Asp Gln Arg Asn Ser Asp Lys Asp Ile Phe Gly Asp Ala Cys Asp
530 535 540

Asn Cys Leu Ser Val Leu Asn Asn Asp Gln Lys Asp Thr Asp Gly Asp
545 550 555 560

Gly Arg Gly Asp Ala Cys Asp Asp Asp Met Asp Gly Asp Gly Ile Lys
565 570 575

Asn Ile Leu Asp Asn Cys Pro Lys Phe Pro Asn Arg Asp Gln Arg Asp
580 585 590

Lys Asp Gly Asp Gly Val Gly Asp Ala Cys Asp Ser Cys Pro Asp Val
595 600 605

Ser Asn Pro Asn Gln Ser Asp Val Asp Asn Asp Leu Val Gly Asp Ser
610 615 620

Cys Asp Thr Asn Gln Asp Ser Asp Gly Asp Gly His Gln Asp Ser Thr
625 630 635 640

Asp Asn Cys Pro Thr Val Ile Asn Ser Ala Gln Leu Asp Thr Asp Lys
645 650 655

Asp Gly Ile Gly Asp Glu Cys Asp Asp Asp Asp Asp Asn Asp Gly Ile
660 665 670

Pro Asp Leu Val Pro Pro Gly Pro Asp Asn Cys Arg Leu Val Pro Asn
675 680 685

Pro Ala Gln Glu Asp Ser Asn Ser Asp Gly Val Gly Asp Ile Cys Glu
690 695 700

Ser Asp Phe Asp Gln Asp Gln Val Ile Asp Arg Ile Asp Val Cys Pro
705 710 715 720

Glu Asn Ala Glu Val Thr Leu Thr Asp Phe Arg Ala Tyr Gln Thr Val
725 730 735

Gly Leu Asp Pro Glu Gly Asp Ala Gln Ile Asp Pro Asn Trp Val Val
740 745 750

Leu Asn Gln Gly Met Glu Ile Val Gln Thr Met Asn Ser Asp Pro Gly
755 760 765

Leu Ala Val Gly Tyr Thr Ala Phe Asn Gly Val Asp Phe Glu Gly Thr
770 775 780

Phe His Val Asn Thr Gln Thr Asp Asp Asp Tyr Ala Gly Phe Ile Phe
785 790 795 800

Gly Tyr Gln Asp Ser Ser Ser Phe Tyr Val Val Met Trp Lys Gln Thr
805 810 815

Glu Gln Thr Tyr Trp Gln Ala Thr Pro Phe Arg Ala Val Ala Glu Pro
820 825 830

Gly Ile Gln Leu Lys Ala Val Lys Ser Lys Thr Gly Pro Gly Glu His
835 840 845

Leu Arg Asn Ser Leu Trp His Thr Gly Asp Thr Ser Asp Gln Val Arg
850 855 860

Leu Leu Trp Lys Asp Ser Arg Asn Val Gly Trp Lys Asp Lys Val Ser
865 870 875 880

Tyr Arg Trp Phe Leu Gln His Arg Pro Gln Val Gly Tyr Ile Arg Val
885 890 895

Arg Phe Tyr Glu Gly Ser Glu Leu Val Ala Asp Ser Gly Val Thr Ile
900 905 910

Asp Thr Thr Met Arg Gly Gly Arg Leu Gly Val Phe Cys Phe Ser Gln
915 920 925

Glu Asn Ile Ile Trp Ser Asn Leu Lys Tyr Arg Cys Asn Asp Thr Ile
930 935 940

Pro Glu Asp Phe Gln Glu Phe Gln Thr Gln Asn Phe Asp Arg Phe Asp
945 950 955 960

Asn

(2) INFORMATION FOR SEQ ID NO: 5:

















(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other nucleic acid 
(A) DESCRIPTION: primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5 

GACTGAATTC CYAAYGCYAA CCAGGCHGAY CAYGAYAARG AYGG 44 
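
The primer above uses IUPAC ambiguity codes (Y, R, H), so it stands for a pool of oligonucleotides rather than a single sequence. As a hedged illustration that is not part of the original listing, the size of that pool can be estimated by multiplying the number of bases each symbol allows. The helper name `degeneracy` is introduced here for illustration only.

```python
from math import prod

# IUPAC nucleotide codes mapped to the bases each one permits.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Return how many distinct sequences a degenerate primer represents."""
    seq = primer.replace(" ", "").upper()
    return prod(len(IUPAC[base]) for base in seq)


# SEQ ID NO: 5, copied from the listing above.
SEQ_ID_NO_5 = "GACTGAATTC CYAAYGCYAA CCAGGCHGAY CAYGAYAARG AYGG"

print(len(SEQ_ID_NO_5.replace(" ", "")))  # 44
print(degeneracy(SEQ_ID_NO_5))            # 768
```

The count follows from seven Y positions, one R position and one H position: 2^7 x 2 x 3 = 768.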



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other nucleic acid 
(A) DESCRIPTION: primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6 
CTAGGAATTC CTGKCCDGGR GTGTTTCCKG TRTGCCA 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other nucleic acid 
(A) DESCRIPTION: primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7 

AATGAGCAGG ACAACTGTGT 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other nucleic acid 
(A) DESCRIPTION: primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8 



TGCTCAGTCT GCTTCCACAT 

CLAIMS 

1. An isolated thrombospondin that has four, type 2 
domains or unique fragments thereof. 

2. An isolated thrombospondin that is free of type 1 
domains. 

3. An isolated thrombospondin that is free of regions 
of homology to procollagen. 

4. An isolated thrombospondin that has at least four, 
type 2 domains, that is free of type 1 domains, and that is 
free of regions of homology to procollagen. 

5. An isolated nucleic acid encoding a thrombospondin 
that has four, type 2 domains, or unique fragments thereof. 

6. An isolated nucleic acid encoding a thrombospondin 
that is free of type 1 domains, or unique fragments thereof. 

7. An isolated nucleic acid encoding a thrombospondin 
that is free of regions of homology to procollagen, or unique 
fragments thereof. 

8. An isolated nucleic acid encoding a thrombospondin 
that has four, type 2 domains, is free of type 1 domains, and 
is free of regions of homology to procollagen. 

9. An isolated nucleotide sequence encoding at least a 
portion of platelet thrombospondin, said portion having at 
least four, type 2 domains. 

10. The isolated nucleotide sequence of claim 9, 
encoding an amino acid sequence selected from the group 
consisting of SEQ ID NOS.: 2 and 4. 

11. The isolated nucleotide sequence of claim 9, 
comprising SEQ ID NO.: 1. 

12. The isolated nucleotide sequence of claim 9, 
comprising SEQ ID NO.: 3. 

13. The isolated nucleotide sequence of claim 9, said 
portion free of any type 1 domains. 

14. The isolated nucleotide sequence of claims 9 or 13, 
said portion free of regions of homology to procollagen. 

15. An isolated polypeptide comprising the expression 
product of a nucleotide sequence encoding at least a portion 
of a platelet thrombospondin gene, wherein said nucleotide 
sequence encodes four, type 2 domains. 

16. The isolated polypeptide of claim 15, said 
nucleotide sequence lacking a sequence encoding for type 1 
domains. 

17. The isolated polypeptide of claim 16, said 
nucleotide sequence lacking a sequence encoding for regions 
of homology to procollagen. 

18. An isolated polypeptide selected from the group 
consisting of SEQ ID NO.: 2 and 4. 

19. A probe capable of distinguishing thrombospondin-4 
from thrombospondins-1 and -2. 

20. The probe of claim 19, comprising a DNA sequence 
having at least four, type 2 domains. 

21. The probe of claim 19, comprising a DNA sequence 
lacking any type 1 domains. 

22. The probe of claim 19, comprising a DNA sequence 
lacking a region of homology with procollagen. 

23. A recombinant vector, said vector having a 
nucleotide sequence for transcription into a messenger RNA 
encoding a thrombospondin of claims 1, 2, 3 or 4. 

24. A microorganism containing a recombinant expression 
vector, said vector comprising a nucleotide sequence encoding 
for a thrombospondin of claims 1, 2, 3 or 4. 

25. A nucleic acid sequence comprising a transcriptional 
promoter linked to a nucleic acid sequence encoding a 
thrombospondin that has at least four, type 2 domains, said 
nucleic acid sequence in an orientation which, upon 
transcription, results in a negative RNA transcript. 

26. The nucleic acid sequence of claim 25, said sequence 
free of type 1 domains. 

27. The nucleic acid sequence of claim 26, said 
nucleotide sequence free of regions of homology with 
procollagen. 

28. An antibody selectively reactive with thrombospondin 
polypeptide having at least four, type 2 domains. 

29. The antibody of claim 28, said thrombospondin free 
of type 1 domains. 

30. The antibody of claim 29, said platelet 
thrombospondin free of regions of homology with procollagen. 

31. A method for producing platelet thrombospondin 
polypeptide, comprising, 




introducing an expression vector into a host, said vector 
containing a DNA sequence encoding at least a portion of a 
polypeptide characterized as platelet thrombospondin, said 
DNA sequence containing at least four, type 2 domains, said 
DNA sequence under control by regulatory regions functional 
in said host, whereby said polypeptide is expressed; 

allowing said host to express said polypeptide as an 
expression product; and 

isolating said expression product. 

32. The method of claim 31, wherein said expression 
vector provided to the host includes a DNA sequence selected 
from the group consisting of SEQ ID NO.: 1 and 3. 

33. The method of claim 31, wherein the expression 
vector provided to the host includes a DNA sequence free of 
type 1 domains. 

34. The method of claim 31, wherein the expression 
vector introduced into the host includes a DNA sequence free 
of regions of homology with procollagen. 

35. A method for inactivating a gene for platelet 
thrombospondin, comprising: 

providing a construct including a nucleotide sequence 
encoding for at least a portion of platelet thrombospondin 
having at least four, type 2 domains, said sequence which, 
when inserted, inactivates production of said platelet 
thrombospondin, said construct further having a promoter 
operatively linked to said sequence; 

introducing said construct into a cell; 

allowing said construct to homologously recombine with 
complementary sequences of said cell; and 

selecting for cells lacking the ability to produce said 
platelet thrombospondin having at least four, type 2 domains. 

36. The method of claim 35, wherein said introduced 
construct comprises a nucleotide sequence encoding for 
platelet thrombospondin that is free of type 1 domains. 

37. The method of claim 36, wherein said introduced 
construct comprises a thrombospondin nucleotide sequence 
encoding for platelet thrombospondin that is free of regions 
of homology with procollagen. 

38. The method of claim 35, wherein the step of 
introducing said construct into a cell comprises introducing 
said construct in a mammalian stem cell. 

39. A transgenic non-human vertebrate animal, all of 
whose cells contain a nucleotide sequence encoding for 
platelet thrombospondin-4. 

40. The transgenic animal of claim 39, wherein said 
polypeptide has at least four, type 2 domains. 

41. The transgenic animal of claim 39, wherein said 
polypeptide lacks any type 1 domains. 

42. The transgenic animal of claim 39, wherein said 
polypeptide lacks a region of homology to procollagen. 

43. A thrombospondin polypeptide expressed in heart and 
skeletal muscle and not expressed in placenta, liver, or 
kidney. 

44. The polypeptide of claim 43, wherein said 
polypeptide has at least four, type 2 domains. 

45. The polypeptide of claims 43 or 44, wherein said 
polypeptide lacks any type 1 domains. 

46. The polypeptide of claim 45, wherein said 
polypeptide lacks a region of homology with procollagen. 